I'm trying to pull the subscriber count for a particular YouTube channel. I looked at several answers on Stack Overflow as well as on external sites. Almost all of them suggest using the YouTube GData API and reading the count from subscriberCount, but the following code
$data = file_get_contents("http://gdata.youtube.com/feeds/api/users/Tollywood/playlists");
$xml = simplexml_load_string($data);
print_r($xml);
returns no such subscriberCount. Is there any other way of getting subscribers count or am I doing something wrong?
The YouTube API v2.0 is deprecated. Here's how to do it with 3.0. OAuth is not needed.
1) Log in to a Google account and go to https://console.developers.google.com/. You may have to start a new project.
2) Navigate to APIs & auth and go to Public API Access -> Create a New Key
3) Choose the option you need (I used 'browser applications'). This will give you an API key.
4) Navigate to your channel in YouTube and look at the URL. The channel ID is here: https://www.youtube.com/channel/YOUR_CHANNEL_ID
5) Use the API key and channel ID to get your result with this query: https://www.googleapis.com/youtube/v3/channels?part=statistics&id=YOUR_CHANNEL_ID&key=YOUR_API_KEY
Great success!
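Steps 1 to 5 can be sketched in PHP roughly like this. It is a minimal sketch, not an official client: the helper names build_channel_stats_url and extract_subscriber_count are made up for illustration, and the code assumes the response carries the count at items[0].statistics.subscriberCount, which is worth confirming against a live response.

```php
<?php
// Build the request URL from step 5 (helper name is illustrative).
function build_channel_stats_url(string $channelId, string $apiKey): string
{
    return 'https://www.googleapis.com/youtube/v3/channels?' . http_build_query([
        'part' => 'statistics',
        'id'   => $channelId,
        'key'  => $apiKey,
    ]);
}

// Pull subscriberCount out of the JSON body; returns null when it is absent.
// Assumes the shape items[0].statistics.subscriberCount.
function extract_subscriber_count(string $json): ?int
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items'][0]['statistics']['subscriberCount'])) {
        return null;
    }
    return (int) $data['items'][0]['statistics']['subscriberCount'];
}

// Usage (needs a real key and channel ID, so it is left commented out):
// $json = file_get_contents(build_channel_stats_url('YOUR_CHANNEL_ID', 'YOUR_API_KEY'));
// echo extract_subscriber_count($json);
```

Keeping the parsing separate from the HTTP call makes it easy to check against a saved response without spending API quota.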
The documentation is actually pretty good, but there's a lot of it. Here are a couple of key links:
Channel information documentation: https://developers.google.com/youtube/v3/sample_requests
"Try it" page: https://developers.google.com/youtube/v3/docs/subscriptions/list#try-it
Try this ;)
<?php
$data = file_get_contents('http://gdata.youtube.com/feeds/api/users/Tollywood');
$xml = new SimpleXMLElement($data);
$stats_data = (array)$xml->children('yt', true)->statistics->attributes();
$stats_data = $stats_data['@attributes'];
/********* OR **********/
$data = file_get_contents('http://gdata.youtube.com/feeds/api/users/Tollywood?alt=json');
$data = json_decode($data, true);
$stats_data = $data['entry']['yt$statistics'];
/**********************************************************/
echo 'lastWebAccess = '.$stats_data['lastWebAccess'].'<br />';
echo 'subscriberCount = '.$stats_data['subscriberCount'].'<br />';
echo 'videoWatchCount = '.$stats_data['videoWatchCount'].'<br />';
echo 'viewCount = '.$stats_data['viewCount'].'<br />';
echo 'totalUploadViews = '.$stats_data['totalUploadViews'].'<br />';
?>
I was able to do it with a regex for my page; I'm not sure whether it will work for you. Try the following code:
<?php
$channel = 'http://youtube.com/user/YOURUSERNAME/';
$t = file_get_contents($channel);
$pattern = '/yt-uix-tooltip" title="(.*)" tabindex/';
preg_match($pattern, $t, $matches, PREG_OFFSET_CAPTURE);
echo $matches[1][0];
<?php
//this code was written by Abdu ElRhoul
//If you have any questions please contact me at info@oklahomies.com
//My website is http://Oklahomies.com
set_time_limit(0);
function retrieveContent($url){
$salida = ""; // initialise the buffer so the .= below never touches an undefined variable
$file = fopen($url,"rb");
if (!$file)
return "";
while (feof ($file)===false) {
$line = fgets ($file, 1024);
$salida .= $line;
}
fclose($file);
return $salida;
}
{
$content = retrieveContent("https://www.youtube.com/user/rhoula/about"); //replace rhoula with the channel name
$marker = '<span class="about-stat"><b>';
$start = strpos($content,$marker) + strlen($marker); // skip past the marker so it is not echoed with the number
$end = strpos($content,'</b>',$start);
$output = substr($content,$start,$end-$start);
echo "Number of Subscribers = $output";
}
?>
<?php
echo get_subscriber("UCOshmVNmGce3iwozz55hpww");
function get_subscriber($channel,$use = "user") {
$subs = 0;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.youtube.com/".$use."/".$channel."/about?disable_polymer=1");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt($ch, CURLOPT_POST, 0 );
curl_setopt($ch, CURLOPT_REFERER, 'https://www.youtube.com/');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0');
$result = curl_exec($ch);
$R = curl_getinfo($ch);
if($R["http_code"] == 200) {
$pattern = '/yt-uix-tooltip" title="(.*)" tabindex/';
// only read the capture group when the pattern actually matched
if (preg_match($pattern, $result, $matches, PREG_OFFSET_CAPTURE)) {
$subs = intval(str_replace(',','',$matches[1][0]));
}
}
if($subs == 0 && $use == "user") return get_subscriber($channel,"channel");
return $subs;
}
I want to download the large version of the images from a Google image search. Below is my code, which works fine for downloading the small images from the search results.
<?php
include_once('simple_html_dom.php');
set_time_limit(0);
$fp = fopen('csv/search.csv','r') or die("can't open file");
$csv_data = array();
while($csv_line = fgetcsv($fp)) {
for ($i = 0, $j = count($csv_line); $i < $j; $i++) {
$imgname = $csv_line[$i];
$search_query = $csv_line[$i];
$search_query = urlencode(trim($search_query));
$html=file_get_html('http://images.google.com/images?as_q='. $search_query .'&hl=en&imgtbs=z&btnG=Search+Images&as_epq=&as_oq=&as_eq=&imgtype=&imgsz=m&imgw=&imgh=&imgar=&as_filetype=&imgc=&as_sitesearch=&as_rights=&safe=images&as_st=y');
$image_container = $html->find('div#rcnt', 0);
$images = $html->find('img');
$image_count = 1; //Enter the amount of images to be shown
$saved = 0; // separate counter so the outer for loop's $i is not overwritten
foreach($images as $image){
$srcimg = $image->src;
if($saved == $image_count) break;
$saved++;
$randname = $imgname.".jpg";
$randname = "Images/".$randname;
file_put_contents("$randname", file_get_contents($srcimg));
}
}
}
?>
Any idea?
This worked for me. simple_html_dom.php wouldn't do the trick since the 'big image' is inside a snippet of JSON near each thumbnail in the DOM.
<?php
$search_query = "Some Keyword"; //change this
$search_query = urlencode( $search_query );
$googleRealURL = "https://www.google.com/search?hl=en&biw=1360&bih=652&tbs=isz%3Alt%2Cislt%3Asvga%2Citp%3Aphoto&tbm=isch&sa=1&q=".$search_query."&oq=".$search_query."&gs_l=psy-ab.12...0.0.0.10572.0.0.0.0.0.0.0.0..0.0....0...1..64.psy-ab..0.0.0.wFdNGGlUIRk";
// Call Google with CURL + User-Agent
$ch = curl_init($googleRealURL);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux i686; rv:20.0) Gecko/20121230 Firefox/20.0');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$google = curl_exec($ch);
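$array_imgurl = array(); // start with an empty list so var_dump below never sees an undefined variable
$array_imghtml = explode("\"ou\":\"", $google); //the big url is inside JSON snippet "ou":"big url"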
foreach($array_imghtml as $key => $value){
if ($key > 0) {
$array_imghtml_2 = explode("\",\"", $value);
$array_imgurl[] = $array_imghtml_2[0];
}
}
var_dump($array_imgurl); //array contains the urls for the big images
die();
?>
I think that rather than crawling the page you can use the Google Custom Search API.
More details are available here:
https://developers.google.com/custom-search/json-api/v1/overview
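As a rough idea of what that looks like in PHP, here is a minimal sketch. The helper names are made up for illustration, $apiKey and $engineId (the "cx" value) have to come from your own Google project, and the code assumes each result in "items" exposes the full-size image URL as "link", so check that against the API reference before relying on it.

```php
<?php
// Build a Custom Search JSON API image query (helper name is illustrative).
// imgSize is optional; 'large' is used here only as an example value.
function build_image_search_url(string $query, string $apiKey, string $engineId): string
{
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query([
        'key'        => $apiKey,
        'cx'         => $engineId,
        'q'          => $query,
        'searchType' => 'image',
        'imgSize'    => 'large',
    ]);
}

// Collect the "link" field of every result; returns an empty array on bad input.
function extract_image_links(string $json): array
{
    $data = json_decode($json, true);
    $links = [];
    foreach ($data['items'] ?? [] as $item) {
        if (isset($item['link'])) {
            $links[] = $item['link'];
        }
    }
    return $links;
}

// Usage (needs real credentials, so it is left commented out):
// $json = file_get_contents(build_image_search_url('Some Keyword', 'YOUR_API_KEY', 'YOUR_CX'));
// print_r(extract_image_links($json));
```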
I have tried quite a few methods of downloading the page below using PHP ($url = 'https://kat.cr/usearch/life%20of%20pi/';). However, I always receive a page with encrypted characters.
I searched for possible solutions before posting and tried out a few, but I haven't been able to get any of them to work yet.
Please see the methods I have tried below and suggest a solution. I am looking for a PHP solution.
Approach 1 - using file_get_contents - returns encrypted characters
<?php
//$contents = file_get_contents($url, $use_include_path, $context, $offset);
include('simple_html_dom.php');
$url = 'https://kat.cr/usearch/life%20of%20pi/';
$html = str_get_html(utf8_encode(file_get_contents($url)));
echo $html;
?>
Approach 2 - using file_get_html - returns encrypted characters
<?php
include('simple_html_dom.php');
$url = 'https://kat.cr/usearch/life%20of%20pi/';
$encoded = htmlentities(utf8_encode(file_get_html($url)));
echo $encoded;
?>
Approach 3 - using gzread - returns blank page
<?php
include('simple_html_dom.php');
$url = 'https://kat.cr/usearch/life%20of%20pi/';
$fp = gzopen($url,'r');
$contents = '';
while($html = gzread($fp , 256000))
{
$contents .= $html;
}
gzclose($fp);
?>
Approach 4 - using gzinflate - returns empty page
<?php
include('simple_html_dom.php');
//function gzdecode($data)
//{
// return gzinflate(substr($data,10,-8));
//}
//$contents = file_get_contents($url, $use_include_path, $context, $offset);
$url = 'https://kat.cr/usearch/life%20of%20pi/';
$html = str_get_html(utf8_encode(file_get_contents($url)));
echo gzinflate(substr($html,10,-8));
?>
Approach 5 - using fopen and fgets - returns encrypted characters
<?php
$url='https://kat.cr/usearch/life%20of%20pi/';
$handle = fopen($url, "r");
if ($handle)
{
while (($line = fgets($handle)) !== false)
{
echo $line;
}
}
else
{
// error opening the file.
echo "could not open the wikipedia URL!";
}
fclose($handle);
?>
Approach 6 - adding ob_start at the beginning of script - page does not load
<?php
ob_start("ob_gzhandler");
$url = 'https://kat.cr/usearch/life%20of%20pi/';
$handle = fopen($url, "r");
if ($handle)
{
while (($line = fgets($handle)) !== false)
{
echo $line;
}
}
else
{
// error opening the file.
echo "could not open the wikipedia URL!";
}
fclose($handle);
?>
Approach 7 - using curl - returns empty page
<?php
$url = 'https://kat.cr/usearch/life%20of%20pi/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); // Define target site
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Return page in string
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.3 Safari/533.2');
curl_setopt($ch, CURLOPT_ENCODING , "gzip");
curl_setopt($ch, CURLOPT_TIMEOUT,5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow redirects
$return = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
$html = str_get_html("$return");
echo $html;
?>
Approach 8 - using R - returns encrypted characters
> thepage = readLines('https://kat.cr/usearch/life%20of%20pi/')
There were 29 warnings (use warnings() to see them)
> thepage[1:5]
[1] "\037‹\b"
[2] "+SC®\037\035ÕpšÐ\032«F°{¼…àßá$\030±ª\022ù˜ú×Gµ."
[3] "\023\022&ÒÅdDjÈÉÎŽj\t¹Iꬩ\003ä\fp\024“ä(M<©U«ß×Ðy2\tÈÂæœ8ž\036â!9ª]ûd<¢QR*>öÝdpä’kß!\022?ÙG~è'>\016¤ØÁ\0019Re¥†\0264æ’؉üQâÓ°Ô^—\016\t¡‹\\:\016\003Š]4¤aLiˆ†8ìS\022Ão€'ðÿ\020a;¦Aš`‚<\032!/\"DF=\034'EåX^ÔˆÚ4‰KDCê‡.¹©¡ˆ\004Gµ4&8r\006EÍÄO\002r|šóóZðóú\026?\0274Š ½\030!\týâ;W8Ž‹k‡õ¬™¬ÉÀ\017¯2b1ÓA< \004„š€&J"
[4] "#ƒˆxGµz\035\032Jpâ;²C‡u\034\004’Ñôp«e^*Wz-Óz!ê\022\001èÌI\023ä;LÖ\v›õ‡¸O⺇¯Y!\031þ\024-mÍ·‡G#°›„¦Î#º¿ÉùÒò(ìó¶³f\177¤?}\017½<Cæ_eÎ\0276\t\035®ûÄœ\025À}rÌ\005òß$t}ï/IºM»µ*íÖšh\006\t#kåd³¡€âȹE÷CÌG·!\017ý°èø‡x†ä\a|³&jLJõìè>\016ú\t™aᾞ[\017—z¹«K¸çeØ¿=/"
[5] "\035æ\034vÎ÷Gûx?Ú'ûÝý`ßßwö¯v‹bÿFç\177F\177\035±?ÿýß\177þupþ'ƒ\035ösT´°ûï¢<+(Òx°Ó‰\"<‘G\021M(ãEŽ\003pa2¸¬`\aGýtÈFíî.úÏîAQÙ?\032ÉNDpBÎ\002Â"
Approach 9 - using BeautifulSoup (python) - returns encrypted characters
import urllib
htmltext = urllib.urlopen("https://kat.cr/usearch/life%20of%20pi/").read()
print htmltext
Approach 10 - using wget on the linux terminal - gets a page with encrypted characters
wget -O page https://kat.cr/usearch/Monsoon%20Mangoes%20malayalam/
Approach 11 -
tried manually by pasting the url to the below service - works
https://www.hurl.it/
Approach 12 -
tried manually by pasting the url to the below service - works
https://www.import.io/
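One observation on the output above: the first element in Approach 8, "\037‹\b", corresponds to the bytes 0x1f 0x8b 0x08, which is how a gzip stream starts. So the "encrypted" text is probably a gzip-compressed response that was never decompressed. A minimal sketch of checking for that in PHP, assuming the zlib extension is available (the helper name maybe_gunzip is made up for illustration):

```php
<?php
// Decompress $body if it starts with the gzip magic bytes, otherwise return it unchanged.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b\x08", 3) === 0) {
        // gzdecode returns false if the data turns out not to be valid gzip
        $decoded = @gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Usage (network call, so it is left commented out):
// echo maybe_gunzip(file_get_contents('https://kat.cr/usearch/life%20of%20pi/'));
```

With cURL, CURLOPT_ENCODING set on the correct handle should have the same effect, since libcurl then decodes the response itself.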
I'm trying to retrieve data from Twitter by connecting to the Twitter API and making some requests with the code below, but I get nothing in return. I requested the bearer token and received it successfully.
This is the code in PHP:
$url = "https://api.twitter.com/1.1/statuses/user_timeline.json?count=10&screen_name=twitterapi";
$headers = array(
"GET".$url." HTTP/1.1",
"Host: api.twitter.com",
"User-Agent: My Twitter App v1.0.23",
"Authorization: Bearer ".$bearer_token."",
"Content-Type: application/x-www-form-urlencoded;charset=UTF-8",
);
$ch = curl_init(); // setup a curl
curl_setopt($ch, CURLOPT_URL,$url); // set url to send to
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // set custom headers
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return output
$retrievedhtml = curl_exec ($ch); // execute the curl
print_r($retrievedhtml);
With print_r nothing is shown at all, and with var_dump I get "bool(false)".
Any idea what could be wrong with this?
Regards,
Try outputting any potential cURL errors with
echo curl_error($ch);
after the curl_exec command. That might give you a clue about what's going wrong. Completely empty responses usually point to something going wrong with the cURL operation itself.
Your headers are wrong... do not include
"GET".$url." HTTP/1.1"
in your headers.
Further, you may print out the HTTP return code by
$info = curl_getinfo($ch);
echo $info["http_code"];
200 is success, anything in the 4xx or 5xx range means something went wrong.
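Putting the suggestions above together, a request could look roughly like this. It is only a sketch: the function names are made up for illustration, and it assumes you already hold a valid bearer token.

```php
<?php
// Build the URL and headers for the timeline request (helper name is illustrative).
// There is no "GET ... HTTP/1.1" entry: cURL writes the request line itself.
function build_timeline_request(string $bearerToken, string $screenName, int $count): array
{
    $url = 'https://api.twitter.com/1.1/statuses/user_timeline.json?' . http_build_query([
        'count'       => $count,
        'screen_name' => $screenName,
    ]);
    $headers = ['Authorization: Bearer ' . $bearerToken];
    return [$url, $headers];
}

// Run the request and surface cURL or HTTP failures instead of silently returning false.
function fetch_timeline(string $url, array $headers): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status !== 200) {
        throw new RuntimeException('HTTP status ' . $status);
    }
    return $body;
}

// Usage (network call, so it is left commented out):
// [$url, $headers] = build_timeline_request($bearer_token, 'twitterapi', 10);
// print_r(json_decode(fetch_timeline($url, $headers)));
```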
I built this based on comments I found in a Twitter dev discussion by @kiers. Hope this helps!
<?php
// Get Token
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, 'https://api.twitter.com/oauth2/token');
curl_setopt($ch,CURLOPT_POST, true);
$data = array();
$data['grant_type'] = "client_credentials";
curl_setopt($ch,CURLOPT_POSTFIELDS, $data);
$screen_name = 'ScreenName'; // add screen name here
$count = 'HowManyTweets'; // add number of tweets here
$consumerKey = 'EnterYourTwitterAppKey'; //add your app key
$consumerSecret = 'EnterYourTwitterAppSecret'; //add your app secret
curl_setopt($ch,CURLOPT_USERPWD, $consumerKey . ':' . $consumerSecret);
curl_setopt($ch,CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
$bearer_token = json_decode($result);
$bearer = $bearer_token->{'access_token'}; // this is your app token
// Get Tweets
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, 'https://api.twitter.com/1.1/statuses/user_timeline.json?count='.$count.'&screen_name='.$screen_name);
curl_setopt($ch,CURLOPT_HTTPHEADER,array('Authorization: Bearer ' . $bearer));
curl_setopt($ch,CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
$cleanresults = json_decode($result);
// Release the Kraken!
echo '<ul id="twitter_update_list">';
foreach ( $cleanresults as $tweet ) {
// Set up some variables
$tweet_url = 'http://twitter.com/'.$screen_name.'/statuses/'.$tweet->id_str; // tweet url
$urls = $tweet->entities->urls; // links
$retweet = isset($tweet->retweeted_status) ? $tweet->retweeted_status->user->screen_name : null; // screen name of the retweeted user, if any
$time = new DateTime($tweet->created_at); // lets grab the date
$date = date_format($time, 'M j, g:ia'); // and format it accordingly
$url_find = array();
$url_links = array();
if ( $urls ) {
if ( !is_array( $urls ) ) {
$urls = array();
}
foreach ( $urls as $url ) {
$theurl = $url->url;
if ( $theurl ) {
$url_block = '<a href="'.$theurl.'" target="_blank">'.$theurl.'</a>'; // link markup is assumed here; adjust to taste
$url_find[] = $theurl; // make array of urls
$url_links[] = $url_block; // make array of replacement link blocks for urls in text
}
}
}
if ( $retweet ) { // add a class for retweets
$link_class = ' class="retweet"';
} else {
$link_class = '';
}
echo '<li'.$link_class.'>';
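$new_text = preg_replace('/@([\\d\\w]+)/', '<a href="https://twitter.com/$1" target="_blank">$0</a>', $tweet->text); // replace all @mentions with actual links (profile URL is an assumed target)
$newer_text = preg_replace('/#([\\d\\w]+)/', '<a href="https://twitter.com/hashtag/$1" target="_blank">$0</a>', $new_text); // replace all #tags with actual links (hashtag URL is an assumed target)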
$text = str_replace( $url_find, $url_links, $newer_text); // replace all links with actual links
echo $text;
echo '<br /><a class="twt-date" href="'.$tweet_url.'" target="_blank">'.$date.'</a>'; // format the date above
echo '</li>';
}
echo '</ul>';
I put together some files on github, named "Flip the Bird." Hope this helps...
I created a PHP library that supports application-only authentication and single-user OAuth: https://github.com/vojant/Twitter-php.
Usage
$twitter = new \TwitterPhp\RestApi($consumerKey,$consumerSecret);
$connection = $twitter->connectAsApplication();
$data = $connection->get('/statuses/user_timeline',array('screen_name' => 'TechCrunch'));
Is it possible to pull text data from another domain (one I don't own) using PHP? If not, is there another method? I've tried using iframes, but because my page is a mobile website, things just don't look good. I'm trying to show a marine forecast for a specific area. Here is the link I'm trying to display.
Update:
This is what I ended up using. Maybe it will help someone else. However I felt there was more than one right answer to my question.
<?php
$ch = curl_init("http://forecast.weather.gov/MapClick.php?lat=29.26034686&lon=-91.46038359&unit=0&lg=english&FcstType=text&TextType=1");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
$content = curl_exec($ch);
curl_close($ch);
echo $content;
?>
This works as I think you want it to, but it depends on the weather site keeping the same format (and on "Outlook" being displayed).
<?php
//define the URL of the resource
$url = 'http://forecast.weather.gov/MapClick.php?lat=29.26034686&lon=-91.46038359&unit=0&lg=english&FcstType=text&TextType=1';
//function from http://stackoverflow.com/questions/5696412/get-substring-between-two-strings-php
function getInnerSubstring($string, $boundstring, $trimit=false)
{
$res = false;
$bstart = strpos($string, $boundstring);
if($bstart !== false) // strpos returns false when nothing is found, and false >= 0 is true in PHP
{
$bend = strrpos($string, $boundstring);
if($bend !== false && $bend > $bstart)
{
$res = substr($string, $bstart+strlen($boundstring), $bend-$bstart-strlen($boundstring));
}
}
return $trimit ? trim($res) : $res;
}
//if the URL is reachable
if($source = file_get_contents($url))
{
$raw = strip_tags($source,'<hr>');
echo '<pre>'.substr(strstr(trim(getInnerSubstring($raw,"<hr>")),'Outlook'),7).'</pre>';
}
else{
echo 'Error';
}
?>
If you need any revisions, please comment.
Try sending a User-Agent header as shown below. Then you can use SimpleXML to parse the contents and extract the text you want.
$opts = array(
'http'=>array(
'method'=>"GET",
'header'=>"User-agent: www.example.com"
)
);
$content = file_get_contents($url, false, stream_context_create($opts));
$xml = simplexml_load_string($content);
You can use cURL for that. Have a look at http://www.php.net/manual/en/book.curl.php
I'm trying to grab some images from another site. The problem is that each image is on a different page,
i.e. id/1, id/2, id/3 and so on.
So far I have the code below, which can grab an image from the single URL given using:
$returned_content = get_data('http://somedomain.com/id/1/');
But I need to turn the line above into a loop (or an array, I guess) so that it grabs the image from page 1, then the next image from page 2, then page 3 and so on automatically.
function get_data($url){
$ch = curl_init();
$timeout = 5;
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,$timeout);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$returned_content = get_data('http://somedomain.com/id/1/');
if (preg_match_all("~http://somedomain.com/images/(.*?)\.jpg~i", $returned_content, $matches)) {
$src = 0;
foreach ($matches[1] as $key) {
if(++$src > 1) break;
$out = $key;
}
$file = 'http://somedomain.com/images/' . $out . '.jpg';
$dir = 'photos';
$imgurl = get_data($file);
file_put_contents($dir . '/' . $out . '.jpg', $imgurl);
echo 'done';
}
As always all help is appreciated and thanks in advance.
This was pretty confusing, because it sounded like you were only interested in saving one image per page. But then the code makes it look like you're actually trying to save every image on each page. So it's entirely possible I completely misunderstood... But here goes.
Looping over each page isn't that difficult:
$i = 1;
$l = 101;
while ($i < $l) {
$html = get_data('http://somedomain.com/id/'.$i.'/');
getImages($html);
$i += 1;
}
The following then assumes that you're trying to save all the images on that particular page:
function getImages($html) {
$matches = array();
$regex = '~http://somedomain.com/images/(.*?)\.jpg~i';
preg_match_all($regex, $html, $matches);
foreach ($matches[1] as $img) {
saveImg($img);
}
}
function saveImg($name) {
$url = 'http://somedomain.com/images/'.$name.'.jpg';
$data = get_data($url);
file_put_contents('photos/'.$name.'.jpg', $data);
}
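One thing to watch for: if a page references the same image more than once, the loop above downloads it once per occurrence. A small sketch of collecting each name only once, using the same URL pattern (the function name extract_image_names is made up for illustration):

```php
<?php
// Return the unique image base names referenced in $html, in order of first appearance.
function extract_image_names(string $html): array
{
    preg_match_all('~http://somedomain\.com/images/(.*?)\.jpg~i', $html, $matches);
    return array_values(array_unique($matches[1]));
}

// Usage with the functions above (network calls, so it is left commented out):
// foreach (extract_image_names($html) as $name) {
//     saveImg($name);
// }
```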