Twitter API - Count number of tweets of a specific string

I'm using the Twitter API to get an integer that tells me how many tweets match a search string I provide.
For example, I search for "mercedes" and want to get back an integer such as 1249, meaning there were that many tweets in the last 2 weeks. As far as I know, Twitter only returns data from the last 2 weeks. It's also fine with me if I get all the records back and count them in PHP or similar. I have already sent some test requests, but I always get arrays back with a maximum of 20 entries.
Does anyone have a solution?
I have already looked at similar questions but couldn't find anything that helped. Many of the answers I have seen no longer work, because Twitter and its API have changed and evolved.

Using the public search API, you will only get tweets from the last 7 days, not all tweets, so your results won't be accurate.
If you still want to test, you have to use the standard search API:
https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html
Set the "count" parameter to 100, and check the "next_results" value in the results to fetch the next 100 tweets, and so on until you get no more results.
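A minimal sketch of the paging step described above, assuming "next_results" arrives as a query string such as "?max_id=...&q=...&count=100". The helper name extractMaxId is hypothetical; it only pulls out max_id, kept as a string so large tweet IDs are not truncated on 32-bit builds, so the value can be added to the next request.

```php
<?php
// Hypothetical helper: read max_id from a next_results query string.
// Returns null when there is no further page to request.
function extractMaxId(?string $nextResults): ?string
{
    if ($nextResults === null || $nextResults === '') {
        return null;
    }
    // Drop the leading "?" and let PHP split the key/value pairs
    parse_str(ltrim($nextResults, '?'), $params);
    return isset($params['max_id']) ? (string) $params['max_id'] : null;
}
```

When the helper returns null, the loop can stop and the running total is the tweet count for the search window.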

I couldn't find a solution either, so I coded one using pieces of code and ideas like @JeffProd's answer above, without using a library. I hope it helps.
PS: You must apply for a Twitter Developer Account and create an app to get your tokens and keys.
<?php
//Access token & access token secret
define("TOKEN", 'XXXXXXXXXXXXXXXX'); //Access token
define("TOKEN_SECRET", 'XXXXXXXXXXXXXXXX'); //Access token secret
//Consumer API keys
define("CONSUMER_KEY", 'XXXXXXXXXXXXXXXX'); //API key
define("CONSUMER_SECRET", 'XXXXXXXXXXXXXXXX'); //API secret key
$method='GET';
$host='api.twitter.com';
$path='/1.1/search/tweets.json'; //API call path
$url="https://$host$path";
//Query parameters
$query = array(
'q' => 'wordtosearch', /* Word to search */
'count' => '100', /* Maximum number of tweets to return per request, up to 100. API calls are rate-limited, so you want to max it */
'result_type' => 'recent', /* Return only the most recent results in the response */
'include_entities' => 'false' /* Saving unnecessary data */
);
//time window in hours
define("WINDOW", 1);
//Authentication
$oauth = array(
'oauth_consumer_key' => CONSUMER_KEY,
'oauth_token' => TOKEN,
'oauth_nonce' => (string)mt_rand(), //A stronger nonce is recommended
'oauth_timestamp' => time(),
'oauth_signature_method' => 'HMAC-SHA1',
'oauth_version' => '1.0'
);
//Used in Twitter's demo
function add_quotes($str) { return '"'.$str.'"'; }
//Searches Twitter for a word and returns one page of results
function twitter_search($query, $oauth, $url){
global $method;
$arr=array_merge($oauth, $query); //Combine the values THEN sort
asort($arr); //Secondary sort (value)
ksort($arr); //Primary sort (key)
$querystring=http_build_query($arr,'','&',PHP_QUERY_RFC3986); //OAuth needs RFC 3986 encoding (space as %20, not +)
//Mash everything together for the text to hash
$base_string=$method."&".rawurlencode($url)."&".rawurlencode($querystring);
//Same with the key
$key=rawurlencode(CONSUMER_SECRET)."&".rawurlencode(TOKEN_SECRET);
//Generate the hash
$signature=rawurlencode(base64_encode(hash_hmac('sha1', $base_string, $key, true)));
//This time we're using a normal GET query, and we're only encoding the query params (without the oauth params)
$url=$url."?".http_build_query($query,'','&',PHP_QUERY_RFC3986); //Force a plain "&" separator regardless of php.ini
$oauth['oauth_signature'] = $signature; //Don't want to abandon all that work!
ksort($oauth); //Probably not necessary, but twitter's demo does it
$oauth=array_map("add_quotes", $oauth); //Also not necessary, but twitter's demo does this too
//This is the full value of the Authorization line
$auth="OAuth ".urldecode(http_build_query($oauth, '', ', '));
//If you're doing post, you need to skip the GET building above and instead supply query parameters to CURLOPT_POSTFIELDS
$options=array( CURLOPT_HTTPHEADER => array("Authorization: $auth"),
//CURLOPT_POSTFIELDS => $postfields,
CURLOPT_HEADER => false,
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_SSL_VERIFYPEER => true); //Keep certificate verification on; the request carries your OAuth credentials
//Query Twitter API
$feed=curl_init();
curl_setopt_array($feed, $options);
$json=curl_exec($feed);
curl_close($feed);
//Return decoded response
return json_decode($json);
};
//Initializing
$done = false; //Loop flag
$countTweets=0; //Tweets fetched
$twitter_data = new stdClass();
$now=new DateTime(date('D M j H:i:s O Y')); //Current search time
//Fetching starts
do{
$twitter_data = twitter_search($query,$oauth,$url);
//Partial results, updating the total amount of tweets fetched
$countTweets += count($twitter_data->statuses);
//If not all the tweets have been fetched, then redo...
if(isset($twitter_data->search_metadata->next_results)){
//Parsing information for max_id in tweets fetched
$string="?max_id=";
$parse=explode("&",$twitter_data->search_metadata->next_results);
$maxID=substr($parse[0],strpos($parse[0],$string)+strlen($string));
$query['max_id'] = -1+$maxID; //Returns results with an ID less than (that is, older than) or equal to the specified ID, to avoid getting the same last tweet
//Twitter will be queried again, this time with the addition of 'max_id'
}else{
$done = true;
}
}while(!$done);
//If all the tweets have been fetched, then we are done
echo "<p>query: ".urldecode($query['q'])."</p>";
echo "<p>tweets fetched: ".$countTweets."</p>";
?>
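The signing step in the script above depends on how the parameters are percent-encoded. The following is a sketch of that step in isolation, assuming simple ASCII parameter names so that a plain ksort() is enough. The function names oauthBaseString and oauthSignature are hypothetical and not part of any library.

```php
<?php
// Sketch: build an OAuth 1.0a signature base string with RFC 3986 encoding.
function oauthBaseString(string $method, string $url, array $params): string
{
    ksort($params); // parameters are sorted by name before signing
    // PHP_QUERY_RFC3986 encodes a space as %20 instead of "+"
    $query = http_build_query($params, '', '&', PHP_QUERY_RFC3986);
    return strtoupper($method) . '&' . rawurlencode($url) . '&' . rawurlencode($query);
}

// Sketch: sign the base string with HMAC-SHA1 and return it base64-encoded.
function oauthSignature(string $baseString, string $consumerSecret, string $tokenSecret): string
{
    $key = rawurlencode($consumerSecret) . '&' . rawurlencode($tokenSecret);
    return base64_encode(hash_hmac('sha1', $baseString, $key, true));
}
```

With the default encoding of http_build_query(), a search such as "a b" would be signed as "a+b", which is a likely cause of authentication errors for multi-word queries.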

Related

Google Cloud Storage paginate objects in a bucket (PHP)

I want to iterate over the objects in a bucket. I REALLY need to paginate this: we have hundreds of thousands of objects in the bucket. Our bucket looks like:
bucket/MLS ID/file 1
bucket/MLS ID/file 2
bucket/MLS ID/file 3
... etc
Simplest version of my code follows. I know the value I'm setting into $params['nextToken'] is wrong, I can't figure out how or where to get the right one. $file_objects is a 'Google\Cloud\Storage\ObjectIterator', right?
// temp: pages of 10, out of a total of 100. I really want pages of 100
// out of all (in my test bucket, I have about 700 objects)
$params = [
'prefix' => $mls_id,
'maxResults' => 10,
'resultLimit' => 100,
'fields' => 'items/id,items/name,items/updated,nextPageToken',
'pageToken' => NULL
];
while ( $file_objects = $bucket->objects($params) )
{
foreach ( $file_objects as $object )
{
print "NAME: {$object->name()}\n";
}
// I think that this might need to be encoded somehow?
// or how do I get the requested nextPageToken???
$params['pageToken'] = $file_objects->nextResultToken();
}
So - I don't understand maxResults vs resultLimit. It would seem that resultLimit would be the total that I want to see from my bucket, and maxResults the size of my page. But maxResults doesn't seem to affect anything, while resultLimit does.
maxResults = 100
resultLimit = 10
produces 10 objects.
maxResults = 10
resultLimit = 100
spits out 100 objects.
maxResults = 10
resultLimit = 0
dumps out all 702 in the bucket, with maxResults having no effect at all. And at no point does "$file_objects->nextResultToken();" give me anything.
What am I missing?

The objects method automatically handles pagination for you. It returns an ObjectIterator object.
The resultLimit parameter limits the total number of objects to return across all pages. The maxResults parameter sets the maximum number to return per page.
If you use a foreach over the ObjectIterator object, it'll iterate through all objects, but note that there are also other methods in ObjectIterator, like iterateByPage.
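A minimal sketch of the page-by-page approach mentioned above, assuming the google/cloud-storage client, where $bucket->objects() returns an ObjectIterator that offers iterateByPage(). The function name listObjectNamesByPage is hypothetical, and $bucket is left untyped so the function can also be tried with a stand-in object.

```php
<?php
// Sketch: collect object names grouped by page.
// $bucket is expected to be a Google\Cloud\Storage\Bucket instance.
function listObjectNamesByPage($bucket, string $prefix, int $pageSize = 100): array
{
    $pages = [];
    $objects = $bucket->objects([
        'prefix'     => $prefix,
        'maxResults' => $pageSize, // page size, as described in the answer above
    ]);
    // Each iteration yields one page: a list of storage objects
    foreach ($objects->iterateByPage() as $page) {
        $names = [];
        foreach ($page as $object) {
            $names[] = $object->name();
        }
        $pages[] = $names;
    }
    return $pages;
}
```

Because the iterator requests further pages on its own, there is no need to pass a page token by hand in this variant.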

Ok, I think I got it. I found the documentation far too sparse and misleading. The code I came up with:
$params = [
'prefix' => <my prefix here>,
'maxResults' => 100,
//'resultLimit' => 0,
'fields' => 'items/id,items/name,items/updated,nextPageToken',
'pageToken' => NULL
];
// Note: setting 'resultLimit' to 0 does not work, I found the
// docs misleading. If you want all results, don't set it at all
// Get the first set of objects per those parameters
$object_iterator = $bucket->objects($params);
// in order to get the next_result_token, I had to get the current
// object first. If you don't, nextResultToken() always returns
// NULL
$current = $object_iterator->current();
$next_result_token = $object_iterator->nextResultToken();
// Print every page, including the last one (which has no next token)
while (true)
{
$object_page_iterator = $object_iterator->iterateByPage();
foreach ($object_page_iterator->current() ?: [] as $file_object )
{
print " -- {$file_object->name()}\n";
}
// no token means the page just printed was the last one
if (!$next_result_token)
{
break;
}
// here is where you use the page token retrieved earlier - get
// a new set of objects
$params['pageToken'] = $next_result_token;
$object_iterator = $bucket->objects($params);
// Once again, get the current object before trying to get the
// next result token
$current = $object_iterator->current();
$next_result_token = $object_iterator->nextResultToken();
print "NEXT RESULT TOKEN: {$next_result_token}\n";
}
This seems to work for me, so now I can get to the actual problem. Hope this helps someone.

PHP Twitter API search/tweets GET tweets from last hour ONLY

Hi, I have been looking around on the internet and haven't been able to find a solution yet. I want to get only the tweets from the past hour that have a certain hashtag.
I am pulling in the tweets with that hashtag, but I don't know how to get only the ones from the past hour.
Here is some example data:
As you can see, there is a created_at date there, but I don't know how to use it to get the tweets from the past hour. I think this would be the only way to do it.
The best way I can think of is converting that date into a UNIX timestamp and then checking whether it falls within the last hour. But there is a lot of data to go through, and this doesn't seem like a very good solution, but it's all I can think of.
If that is the only solution there is, could someone give me an example of how to convert that date to a UNIX timestamp in PHP? If you have a different solution, I would love to see a detailed example :) Thanks
You may also find this link useful: https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets

I couldn't find an alternative solution either, so I coded it. The WINDOW constant sets the time interval to 1 hour. Hope it helps!
<?php
//Access token & access token secret
define("TOKEN", 'XXXXXXXXXXXXXXXX'); //Access token
define("TOKEN_SECRET", 'XXXXXXXXXXXXXXXX'); //Access token secret
//Consumer API keys
define("CONSUMER_KEY", 'XXXXXXXXXXXXXXXX'); //API key
define("CONSUMER_SECRET", 'XXXXXXXXXXXXXXXX'); //API secret key
$method='GET';
$host='api.twitter.com';
$path='/1.1/search/tweets.json'; //API call path
$url="https://$host$path";
//Query parameters
$query = array(
'q' => 'wordtosearch', /* Word to search */
'count' => '100', /* Maximum number of tweets to return per request, up to 100. API calls are rate-limited, so you want to max it */
'result_type' => 'recent', /* Return only the most recent results in the response */
'include_entities' => 'false' /* Saving unnecessary data */
);
//time window in hours
define("WINDOW", 1);
//Authentication
$oauth = array(
'oauth_consumer_key' => CONSUMER_KEY,
'oauth_token' => TOKEN,
'oauth_nonce' => (string)mt_rand(), //A stronger nonce is recommended
'oauth_timestamp' => time(),
'oauth_signature_method' => 'HMAC-SHA1',
'oauth_version' => '1.0'
);
//Used in Twitter's demo
function add_quotes($str) { return '"'.$str.'"'; }
//Searches Twitter for a word and returns one page of results
function twitter_search($query, $oauth, $url){
global $method;
$arr=array_merge($oauth, $query); //Combine the values THEN sort
asort($arr); //Secondary sort (value)
ksort($arr); //Primary sort (key)
$querystring=http_build_query($arr,'','&',PHP_QUERY_RFC3986); //OAuth needs RFC 3986 encoding (space as %20, not +)
//Mash everything together for the text to hash
$base_string=$method."&".rawurlencode($url)."&".rawurlencode($querystring);
//Same with the key
$key=rawurlencode(CONSUMER_SECRET)."&".rawurlencode(TOKEN_SECRET);
//Generate the hash
$signature=rawurlencode(base64_encode(hash_hmac('sha1', $base_string, $key, true)));
//This time we're using a normal GET query, and we're only encoding the query params (without the oauth params)
$url=$url."?".http_build_query($query,'','&',PHP_QUERY_RFC3986); //Force a plain "&" separator regardless of php.ini
$oauth['oauth_signature'] = $signature; //Don't want to abandon all that work!
ksort($oauth); //Probably not necessary, but twitter's demo does it
$oauth=array_map("add_quotes", $oauth); //Also not necessary, but twitter's demo does this too
//This is the full value of the Authorization line
$auth="OAuth ".urldecode(http_build_query($oauth, '', ', '));
//If you're doing post, you need to skip the GET building above and instead supply query parameters to CURLOPT_POSTFIELDS
$options=array( CURLOPT_HTTPHEADER => array("Authorization: $auth"),
//CURLOPT_POSTFIELDS => $postfields,
CURLOPT_HEADER => false,
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_SSL_VERIFYPEER => true); //Keep certificate verification on; the request carries your OAuth credentials
//Query Twitter API
$feed=curl_init();
curl_setopt_array($feed, $options);
$json=curl_exec($feed);
curl_close($feed);
//Return decoded response
return json_decode($json);
};
//Initializing
$done = false; //Loop flag
$countCalls=0; //Api Calls
$countTweets=0; //Tweets fetched
$intervalTweets=0; //Tweets in the last WINDOW hour
$twitter_data = new stdClass();
$now=new DateTime(date('D M j H:i:s O Y')); //Current search time
//Fetching starts
do{
$twitter_data = twitter_search($query,$oauth,$url);
$countCalls+=1;
//Partial results, updating the total amount of tweets fetched
$countTweets += count($twitter_data->statuses);
//Searching for tweets inside the time window
foreach($twitter_data->statuses as $tweet){
$time=new DateTime($tweet->created_at);
$interval = $time->diff($now);
$days=$interval->format('%a');
$hours=$interval->h;
$mins=$interval->i;
$secs=$interval->s;
$diff=$days*24 + $hours + $mins/60 + $secs/3600;
if($diff<WINDOW){
$intervalTweets+=1;
}else{
$done = true;
break;
}
}
//If not all the tweets have been fetched, then redo...
if(!$done && isset($twitter_data->search_metadata->next_results)){
//Parsing information for max_id in tweets fetched
$string="?max_id=";
$parse=explode("&",$twitter_data->search_metadata->next_results);
$maxID=substr($parse[0],strpos($parse[0],$string)+strlen($string));
$query['max_id'] = -1+$maxID; //Returns results with an ID less than (that is, older than) or equal to the specified ID, to avoid getting the same last tweet
//Twitter will be queried again, this time with the addition of 'max_id'
}else{
$done = true;
}
}while(!$done);
//If all the tweets have been fetched, then we are done
echo "<p>query: ".urldecode($query['q'])."</p>";
echo "<p>tweets fetched: ".$countTweets."</p>";
echo "<p>API calls: ".$countCalls."</p>";
echo "<p>tweets in the last ".WINDOW." hour: ".$intervalTweets."</p>";
?>
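The question also asks how to turn created_at into a UNIX timestamp. Here is a small sketch of that conversion, assuming the usual Twitter date layout such as "Wed Oct 10 20:19:24 +0000 2018". The function names tweetTimestamp and isWithinWindow are hypothetical.

```php
<?php
// Sketch: convert Twitter's created_at string to a UNIX timestamp.
// Returns null when the string does not match the expected layout.
function tweetTimestamp(string $createdAt): ?int
{
    $date = DateTime::createFromFormat('D M d H:i:s O Y', $createdAt);
    return $date === false ? null : $date->getTimestamp();
}

// Sketch: true when the tweet is less than $hours old relative to $now.
function isWithinWindow(string $createdAt, int $now, float $hours = 1.0): bool
{
    $ts = tweetTimestamp($createdAt);
    return $ts !== null && ($now - $ts) < $hours * 3600;
}
```

Calling isWithinWindow($tweet->created_at, time(), WINDOW) could replace the days/hours/minutes arithmetic in the loop above with a single comparison in seconds.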

How to get the parameters after an oauth fetch

I'm trying to read some parameters passed in an OAuth fetch.
In the first script, I make the OAuth request this way:
$ids = array( 'a' => 1, 'b' => 2);
$oauth = new OAuth("consumer_key","consumer_secret");
$url = $this->getUrlApi();
$oauth->fetch($url,array('ids' => $ids),OAUTH_HTTP_METHOD_POST);
In the second script, I'm trying to read the parameters I passed in the request, but I get an empty parameter.
$ids = $_REQUEST['ids'];
What is wrong with my code? Thanks

Sort the tweets by date using Twitter Search API

I have written an application that searches tweets for a specific keyword using the Twitter API, but I am trying to find a way to display the latest tweets first. I am unable to find a way to sort the tweets received in the response.
I am referring to https://dev.twitter.com/docs/api/1.1/get/search/tweets and below is my code.
I have included all necessary files and set all required parameters.
function search(array $query)
{
$toa = new TwitterOAuth(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET);
return $toa->get('search/tweets', $query);
}
$query = array(
"q" => "Sachin Tendulkar",
"count" => 10,
"result_type" => "popular"
);
$results = search($query);
Any help on this would be appreciated. Thanks

To display the latest tweets first, set result_type to recent.
$query = array(
"q" => "Sachin Tendulkar",
"count" => 10,
"result_type" => "recent"
);
More about the result_type parameter:
mixed: Include both popular and real time results in the response.
recent: return only the most recent results in the response.
popular: return only the most popular results in the response.
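If the response still needs to be ordered on the client side, for example when using mixed or popular results, a sort on created_at is one option. This is a sketch that assumes each status is a decoded object with a created_at string in Twitter's usual layout. The function name sortTweetsNewestFirst is hypothetical.

```php
<?php
// Sketch: sort decoded tweet objects so the newest comes first.
function sortTweetsNewestFirst(array $statuses): array
{
    usort($statuses, function ($a, $b) {
        // strtotime() parses dates like "Wed Oct 10 20:19:24 +0000 2018"
        return strtotime($b->created_at) <=> strtotime($a->created_at);
    });
    return $statuses;
}
```

With the code from the question, this could be called as sortTweetsNewestFirst($results->statuses).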

How to get full list of Twitter followers using new API 1.1

I am using https://api.twitter.com/1.1/followers/ids.json?cursor=-1&screen_name=sitestreams&count=5000 to list Twitter followers, but I only get a list of 200 followers. How can I retrieve the full list of followers using the new API 1.1?

You must first set up your application:
<?php
$consumerKey = 'Consumer-Key';
$consumerSecret = 'Consumer-Secret';
$oAuthToken = 'OAuthToken';
$oAuthSecret = 'OAuth Secret';
# API OAuth
require_once('twitteroauth.php');
$tweet = new TwitterOAuth($consumerKey, $consumerSecret, $oAuthToken, $oAuthSecret);
You can download the twitteroauth.php from here: https://github.com/elpeter/pv-auto-tweets/blob/master/twitteroauth.php
Then
You can retrieve your followers like this:
$tweet->get('followers/ids', array('screen_name' => 'YOUR-SCREEN-NAME-USER'));
If you want to retrieve the next group of 5000 followers, you must pass the next_cursor value from the first call as the cursor.
$tweet->get('followers/ids', array('screen_name' => 'YOUR-SCREEN-NAME-USER', 'cursor' => 9999999999));
You can read about using cursors to navigate collections here: https://dev.twitter.com/docs/misc/cursoring
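The cursor handling above can be sketched as a loop that keeps requesting pages until the API reports a next cursor of 0. This is only an outline under assumptions: $fetchPage is a hypothetical callable that takes a cursor and returns a decoded response with ids and next_cursor_str, and the default cap of 15 pages is a guess at a sensible limit per run, so check the current rate limit before relying on it.

```php
<?php
// Sketch: follow the next cursor until the API reports "0" (no more pages).
// $fetchPage is a hypothetical callable: cursor in, decoded response out
// (an object with ->ids and ->next_cursor_str), or null on failure.
function collectFollowerIds(callable $fetchPage, int $maxPages = 15): array
{
    $ids = [];
    $cursor = '-1'; // -1 asks for the first page
    for ($page = 0; $page < $maxPages && $cursor !== '0'; $page++) {
        $result = $fetchPage($cursor);
        if ($result === null || !isset($result->ids)) {
            break; // stop on errors instead of looping forever
        }
        $ids = array_merge($ids, $result->ids);
        $cursor = (string) $result->next_cursor_str;
    }
    return $ids;
}
```

With the $tweet object from above, the callable could simply wrap $tweet->get('followers/ids', ...) and pass the cursor it receives.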

You can't fetch more than 200 at once. This is clearly stated in the documentation for the count parameter:
The number of users to return per page, up to a maximum of 200. Defaults to 20.
You can get the rest via pagination using the cursor parameter:
"cursor=-1" means page 1: "If no cursor is provided, a value of -1 will be assumed, which is the first 'page.'"

Here's how I run/update the full list of follower IDs on my platform. I'd avoid using sleep() like @aphoe's script does. It's really bad to keep a connection open that long. What happens if your user has 1 million followers? Are you going to keep that connection open for a week? lol. If you must, run a cron job or save to Redis/Memcached. Rinse and repeat until you get all the followers.
Note that my code below is a class method that runs through a cron command every minute. I'm using Laravel 5.1, so you can probably ignore a lot of this code, as it's unique to my platform: TwitterOAuth is a model that holds all the OAuth credentials I have in the database, TwitterFollowerList is another table where I check whether an entry already exists, TwitterFollowersDaily is another table where I store/update the day's total for the user, and TwitterApi is the Abraham\TwitterOAuth package. You can use whatever library you like, though.
This should give you a good sense of how you might do the same, or even figure out a better way. I won't explain all the code, as there's a lot happening, but you should be able to work through it. Let me know if you have any questions.
/**
* Update follower list for each oAuth
*
* @return response
*/
public function updateFollowers()
{
TwitterOAuth::chunk(200, function ($oauths)
{
foreach ($oauths as $oauth)
{
$page_id = $oauth->page_id;
$follower_list = TwitterFollowerList::where('page_id', $page_id)->first();
if (!$follower_list || $follower_list->updated_at < Carbon::now()->subMinutes(15))
{
$next_cursor = isset($follower_list->next_cursor) ? $follower_list->next_cursor : -1;
$ids = isset($follower_list->follower_ids) ? $follower_list->follower_ids : [];
$twitter = new TwitterApi($oauth->oauth_token, $oauth->oauth_token_secret);
$results = $twitter->get("followers/ids", ["user_id" => $page_id, "cursor" => $next_cursor]);
if (isset($results->errors)) continue;
$ids = $results->ids;
if ($results->next_cursor !== 0)
{
$ticks = 0;
do
{
if ($ticks === 13)
{
$ticks = 0;
break;
}
$ticks++;
$results = $twitter->get("followers/ids", ["user_id" => $page_id, "cursor" => $results->next_cursor]);
if (!$results) break;
$more_ids = $results->ids;
$ids = array_merge($ids, $more_ids);
}
while ($results->next_cursor > 0);
}
$stats = [
'page_id' => $page_id,
'follower_count' => count($ids),
'follower_ids' => $ids,
'next_cursor' => ($results->next_cursor > 0) ? $results->next_cursor : null,
'updated_at' => Carbon::now()
];
TwitterFollowerList::updateOrCreate(['page_id' => $page_id], $stats);
TwitterFollowersDaily::updateOrCreate([
'page_id' => $page_id,
'date' => Carbon::now()->toDateString()
],
[
'page_id' => $page_id,
'date' => Carbon::now()->toDateString(),
'follower_count' => count($ids),
]
);
continue;
}
}
});
}
