How to log PHP actions to an HTML-readable file? - php

I am making a plugin for WordPress, and lately I've been having some problems; I'm not sure whether they are plugin-related or not.
The plugin pulls videos and their description, tags, thumbnail, etc.
When I type a search term into my plugin and hit Enter, the code requests the YouTube search page, searches for videos, and pulls data from it.
The problem is that it doesn't pull videos every time I search. Sometimes it works, sometimes it doesn't, regardless of whether the search terms are the same.
Here's an example of the code; it's a bit long, so I'll just set the search terms in a variable instead of using a form.
$searchterms = 'funny cat singing';
$searchterm = rawurlencode($searchterms); // spaces become %20 in the query string
$getfeed = curl_init();
curl_setopt($getfeed, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($getfeed, CURLOPT_URL, 'https://www.youtube.com/results?search_query=' . $searchterm);
curl_setopt($getfeed, CURLOPT_RETURNTRANSFER, true);
curl_setopt($getfeed, CURLOPT_CONNECTTIMEOUT, 20);
$str = curl_exec($getfeed);
curl_close($getfeed);
// str_get_html() comes from the Simple HTML DOM library
$feedURL = str_get_html($str);
foreach ($feedURL->find('ol[id="search-results"] li') as $video) {
    // get info like thumb, time, etc...
}
So, as I said, sometimes I get the videos and sometimes I don't.
How can I record the actions in a log file so I can see what's happening when I press search?
Something like:
Pulling videos
Search terms: https://www.youtube.com/results?search_query=funny+cat+singing
And then, if I get a response from YouTube, something like:
Page found, pulling videos.
Or, if the page is not found:
Page not found, didn't get a response from YouTube.
If the page is found, the next step is to check whether the search term actually returns anything, and so on.
If I just know the basics of how to start logging, I will customize later what information I need to log.
Any advice?

You may try one of these two tutorials:
http://www.devshed.com/c/a/php/logging-with-php/
http://www.hotscripts.com/blog/php-error-log-file-archive/
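Beyond the tutorials, the basic pattern is just appending timestamped lines to a text file. A minimal sketch, woven around the curl_exec() call from the question (plugin_log() and the plugin.log path are illustrative names, not an existing API):

function plugin_log($message)
{
    $line = date('[Y-m-d H:i:s] ') . $message . PHP_EOL;
    // FILE_APPEND keeps earlier entries; LOCK_EX prevents interleaved writes
    file_put_contents(__DIR__ . '/plugin.log', $line, FILE_APPEND | LOCK_EX);
}

plugin_log('Pulling videos');
plugin_log('Search URL: https://www.youtube.com/results?search_query=' . $searchterm);

$str = curl_exec($getfeed);
if ($str === false) {
    plugin_log('Page not found, no response from YouTube: ' . curl_error($getfeed));
} else {
    plugin_log('Page found, pulling videos.');
}
curl_close($getfeed);

PHP's built-in error_log($line, 3, '/path/to/plugin.log') does the same kind of appending if you would rather not roll your own, and you can tail the file or open it in a browser while testing.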

Related

Dynamic OG meta data for Facebook

I have a small web page that, every day, displays a one-word answer, either Yes or No, depending on some other factor that changes daily.
Underneath this, I have a Facebook Like button. I want this button to post, in the title/description, either "Yes" or "No", depending on the verdict that day.
I have set up the OG metadata dynamically, using PHP to echo the correct string into og:title etc. But Facebook caches the value, so someone sharing my page on Tuesday can easily end up posting the wrong content to Facebook.
I have confirmed this is the issue by using the Facebook object debugger. As soon as I force a refresh there, all is well. I attempted to automate this using cURL, but it doesn't seem to work:
$ch = curl_init();
$timeout = 30;
curl_setopt($ch, CURLOPT_URL, "http://developers.facebook.com/tools/lint/?url={http://ispizzahalfprice.com}");
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
Am I missing some easy fix here? Or do I need to re-evaluate my website structure to achieve what I am looking for (e.g. use two separate pages)?
Here's the page in case it's useful: http://ispizzahalfprice.com
Using two separate URLs would be the safe bet. As you have observed, Facebook caches URL scrapes quite heavily. You've also seen that you, as the admin of the app, can flush and refresh Facebook's cache by pulling the page through the debugger again.
Using two URLs would solve this issue because Facebook could cache the results all they want: there would still be a separate URL for "yes" and one for "no".
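If you do want to automate the refresh rather than use two URLs, scraping the debugger page itself (as in the question) is not the supported route. At the time of writing, Facebook documented forcing a re-scrape by POSTing the page URL to the Graph API with scrape=true; a sketch under that assumption (newer API versions may also require an access token):

$ch = curl_init('https://graph.facebook.com/');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'id'     => 'http://ispizzahalfprice.com', // the page whose OG tags changed
    'scrape' => 'true',
)));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch); // JSON describing the freshly scraped object
curl_close($ch);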

facebook fan page user data extraction php

To extract the list of users of a particular Facebook fan page, I am using the code below:
$text = file_get_contents('rawnike.php');
// $text = file_get_contents('http://www.facebook.com/plugins/fan.php?connections=10000&id=15087023444');
$text = preg_replace("/<script[^>]+\>/i", "", $text);
$text = preg_replace("/<img[^>]+\>/i", "", $text);
$pattern = '!(https?://[^\s]+)!'; // refine this for better/more specific results
if (preg_match_all($pattern, $text, $matches)) {
    $links = $matches[1];
    //print_r($links);
    //var_dump($links);
}
// drop the first eight links (page chrome, not profile links)
unset($links[0], $links[1], $links[2], $links[3], $links[4], $links[5], $links[6], $links[7]);
//var_dump($links);
$links = str_replace('https', 'http', $links);
$links = str_replace('\"', '', $links);
foreach ($links as $value) {
    echo "fb user ID: $value<br />\n";
}
And with this I am successfully retrieving users' profile links via file_get_contents('rawnike.php') (rawnike.php is saved locally).
But if I try to pull the same page from the URL, file_get_contents("http://www.facebook.com/plugins/fan.php?connections=10000&id=15087023444"), I am not able to retrieve it. This means I cannot fetch the Facebook page's source directly; I have to save the page's source manually.
I observed the same thing when parsing a user's page: if I manually store the page's source locally and parse it, I can extract the user's interests. On the other hand, if I try to fetch the source directly from the URL, I don't get the same source.
In other words, $source = file_get_contents($url); gives content that says something like "your browser is not supported", while $source = file_get_contents($locally_saved_source_file); gives exactly the content I need to parse.
After a little research I understood that FQL is the right approach for things like this. But please help me understand why there is a difference in the extracted source code, and whether FQL is the only way or there is some other way I can proceed.
But please help me understand why there is a difference in the extracted source code
Because Facebook realizes, by looking at the details of your HTTP request (the User-Agent header and so on), that it's not a real browser used by an actual person making the request, so they try to block you from accessing the data.
One can try to work around this by providing request details that make the request look more like a "real" browser, but scraping HTML pages to get the desired info is generally not the way to go, because:
and is FQL the only way, or is there some other way I can proceed
that's what APIs are for. FQL and the Graph API are the means Facebook provides for you to access their data.
If there is data you are interested in that is not provided by those, then Facebook does not really want to give you that data. The data about the persons who like a page is exactly that kind of data.
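To make the API route concrete: the public metadata of the page in question was available from the old, unversioned Graph API without any scraping. The field names below are an assumption based on that era of the API, and current versions require an access token and versioned paths:

$json = file_get_contents('https://graph.facebook.com/15087023444');
$page = json_decode($json, true);
// yields the page name and the aggregate like count, but not the
// individual users who liked the page; Facebook does not expose those
echo $page['name'] . ': ' . $page['likes'] . ' likes';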
<?php
$curl = curl_init("https://www.facebook.com/plugins/fan.php?connections=10000&id=15087023444");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1");
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
$data = curl_exec($curl);
curl_close($curl);
// strip the leading <script>, <link>, <meta>, etc. markup before the first <div id...>
$data = preg_replace("%(.*?)(<div id.*?>)%is", "", $data);
But the max connections are 100. :S
The connections parameter cannot exceed 100; you are requesting 10000.

Facebook Real time updates

I am creating an application via which a user shares a specific post on the Facebook wall or the user's timeline page. This is done via the JavaScript SDK and the Facebook Graph API.
I need to collect all the comments and likes on that shared post, whose ID I store in the database.
Then I run a cron job which uses the Graph API again to get the posts and comments on a specific feed (ID from the DB).
But I want to know: is there any way to get real-time updates? For example, if someone comments on the feed, Facebook sends a request to my URL and that endpoint saves or updates the comment in my database.
If not, let me know whether my cron approach is the best way to do this, or whether there is another way.
Facebook does indeed give you the ability to get real-time updates, as discussed in this document.
According to that document, however, it doesn't look like you can get updates about the comments/likes of a post; you can only get updates to specific fields/collections of the User object, not of a specific post.
There is no ability to update it in real time; you can do it with cron, or update the comment and like counts when the user presses a refresh button.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $POST_URL);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.1) Gecko/20100101 Firefox/10.0.1");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$file_content = curl_exec($ch);
curl_close($ch);
if ($file_content === false) {
    // the post was deleted, or something else went wrong
} else {
    $post_data = json_decode($file_content, true);
}
In $POST_URL you put https://graph.facebook.com/ followed by the post ID.
$post_data['likes']['count'] will then hold the like count,
and $post_data['comments']['count'] the comment count.
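For example, with a post ID of 123_456 stored in your database (the ID is purely illustrative), the snippet above would be driven like this:

$POST_ID  = '123_456'; // illustrative; use the ID saved in your DB
$POST_URL = 'https://graph.facebook.com/' . $POST_ID;
// ... run the cURL/json_decode code above, then:
echo $post_data['likes']['count'] . ' likes, '
   . $post_data['comments']['count'] . ' comments';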

PHP CURL not working on add to basket

I am trying to cURL this URL so that it automatically adds a product to a basket:
http://www.juno.co.uk/cart/add/440551/01/
When I follow the URL in the browser, it adds the product to the basket.
When I cURL it, it doesn't add it.
This is my cURL code:
$url = "http://www.juno.co.uk/cart/add/440551/01/";
$c = curl_init();
curl_setopt($c, CURLOPT_URL, $url);
$file_path = 'cookies.txt';
curl_setopt($c, CURLOPT_POST, true);
curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 50);
curl_setopt($c, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_COOKIEJAR, $file_path);
$complete = curl_exec($c);
curl_close($c);
Any ideas? cURL is definitely set up on my server, as I am successfully using it for other scripts.
You can see the output here: http://soundshelter.net/addjuno.php?id=440551 - it redirects to the page that I expect (i.e. the item is added to the basket), but I do not want to send the user to that page. I only want to ping the page so that the item is added to the basket while the user remains on my page. Any ideas?
Thanks in advance.
The cart (or something about it: its ID, contents, etc.) is stored in a session; you would have to create a custom function to which you can pass the ID of the cart so that you can update it.
EDIT:
If this were possible, it would be a security risk (anyone could add items to anybody's cart).
The user is identified by a session ID; you would need to "steal" it from your visitor and call the URL via cURL as if you were that user (you can set cookies for the cURL session, I think, and put the session ID in them). But of course this is very much like stealing cookie/session data, and there are defenses against it.
In my opinion, the only real solution would be if juno.co.uk had a public API for such operations.
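One detail worth noting about the cookie handling in the question's code: CURLOPT_COOKIEJAR only writes cookies out when the handle is closed; to replay the same session on a later request you also have to read them back in with CURLOPT_COOKIEFILE. A small sketch, though even then the session belongs to your server's cURL client, not to your visitor's browser:

$file_path = 'cookies.txt';
curl_setopt($c, CURLOPT_COOKIEFILE, $file_path); // send cookies saved by earlier runs
curl_setopt($c, CURLOPT_COOKIEJAR,  $file_path); // save any new cookies on curl_close()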
The answer may be as simple as this: you shouldn't need to POST, and that might be causing problems, since you aren't sending or specifying any data. What I mean is, comment out that line:
//curl_setopt($c, CURLOPT_POST, true);
Sidebar: can you show the output that you do get?

php: Fetch google first result

I had this code that helps me fetch the URL of an actor's page on IMDb by searching for "IMDB + actor name" and giving me the URL of their IMDb profile page.
It worked fine until 5 minutes ago, when all of a sudden it stopped working. Is there a daily limit on Google queries (I would find that very strange!), or did I alter something in my code without noticing (in which case, can you spot what's wrong)?
function getIMDbUrlFromGoogle($title)
{
    $url = "http://www.google.com/search?q=imdb+" . rawurlencode($title);
    echo $url;
    $html = $this->geturl($url);
    $urls = $this->match_all('/<a href="(http:\/\/www.imdb.com\/name\/nm.*?)".*?>.*?<\/a>/ms', $html, 1);
    if (!isset($urls[0]))
        return NULL;
    else
        return $urls[0]; // return first IMDb result
}

function geturl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1");
    $html = curl_exec($ch);
    curl_close($ch);
    return $html;
}

function match_all($regex, $str, $i = 0)
{
    if (preg_match_all($regex, $str, $matches) === false)
        return false;
    else
        return $matches[$i];
}
They will, in fact, throttle you if you make queries too fast or make too many. For example, their SOAP API limits you to 1,000 queries a day. Either throw in a wait, or use something that invites this kind of use, such as Yahoo's BOSS: http://developer.yahoo.com/search/boss/
ETA: I really, really like BOSS, and I'm a Google fangirl. It gives you a lot of resources, clean data, and flexibility. Google never gave us anything like this, which is too bad.
There is an API for Google search, and it is limited to 100 queries/day. Also, fetching Google search results with any kind of automated tool is not allowed, according to Google's guidelines.
Google's web page is designed for use by humans; they will shut you out if they notice you using it heavily in an automated way. Their Terms of Service are clear that what you are doing is not allowed. (Though they no longer seem to link directly to the terms from the search results page, much less their front page; and in any case, as I understand it, at least some courts have held that merely putting a link on a page isn't legally binding.)
They want you to use their API and, if you use it heavily, to pay (their rates aren't exorbitant).
That said, why aren't you going directly to IMDb?
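For instance, IMDb's own search page can be fetched in the same style as the code above. The /find parameters and the markup the regex matches are assumptions about IMDb's public site and may change, so treat this as a sketch:

function getIMDbUrlDirect($name)
{
    // s=nm restricts the search to people (actors, directors, ...)
    $html = $this->geturl('http://www.imdb.com/find?s=nm&q=' . rawurlencode($name));
    $urls = $this->match_all('/href="(\/name\/nm\d+)\//', $html, 1);
    return isset($urls[0]) ? 'http://www.imdb.com' . $urls[0] : NULL;
}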
