Twitter API: accessing public replies and retweets - php

I'm trying to sample the relative frequencies of regular tweets vs retweets vs replies in the public timeline; however, I can't seem to access the latter two. Is there any way to pull down public replies and retweets using the Twitter API? (For the record, I'm using PHP, but I think this is more of an API question.) Or, alternatively, is there any way to empirically determine the relative fractions of retweets/replies/neither that exist on Twitter?
Edit: I should have made this explicit, and I apologize: the problem resides in the fact that the API seems to have eliminated replies and retweets from the statuses/public_timeline REST call, leaving only regular tweets. My question, then, is whether or not there is a way to access public replies and retweets in addition to regular tweets, given that this particular method call does not seem to work. I hope that clears things up.

If you would like to get those frequencies for live status updates, you could consume Twitter's Streaming API. Since you want to use PHP, Phirehose allows you to consume the streams easily.
From there you would just examine each status update, looking for whatever markers you like to determine whether it is a retweet, a reply, etc. Twitter Text (PHP) might be useful (even if you just borrow the regexes).
A quick run of the above against the "Sample" stream (~1% of public statuses) showed:
2,986 replies
1,481 retweets
6,706 mentions
10,000 tweets

You can extract that information by parsing each tweet's text and looking for "RT @XXXX" (retweet) or "@XXXX" (reply or mention) patterns.
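The marker check described in these answers can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, and the conventions it relies on are assumptions: a retweet is taken to start with "RT @user", a reply to start with "@user", and a mention to contain "@user" anywhere else. The function names are made up for this example, and the regexes are far simpler than the ones in Twitter Text.

```python
import re
from collections import Counter

# Assumed conventions (illustrative, not taken from the Twitter docs):
#   retweet: text starts with "RT @user"
#   reply:   text starts with "@user"
#   mention: "@user" appears somewhere else in the text
RETWEET_RE = re.compile(r"^\s*RT\s+@\w+", re.IGNORECASE)
REPLY_RE = re.compile(r"^\s*@\w+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")  # the lookbehind skips e-mail addresses


def classify_tweet(text):
    """Return 'retweet', 'reply', 'mention' or 'regular' for one status text."""
    if RETWEET_RE.search(text):
        return "retweet"
    if REPLY_RE.search(text):
        return "reply"
    if MENTION_RE.search(text):
        return "mention"
    return "regular"


def tally(texts):
    """Count how many status texts fall into each category."""
    return Counter(classify_tweet(text) for text in texts)
```

Note that the categories here are mutually exclusive (a reply is not also counted as a mention), so adjust the order of the checks if you want overlapping counts.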

Related

Collect data over a period

I want to collect data for a particular keyword over the last seven days without user authentication on Twitter. The problem is that the result set for a single day was more than 3000 tweets, which quickly gets my app blocked by rate limiting. I need a workaround for this. In fact I don't need the data itself, I just need the count for each day (probably this is not possible). Could you please advise me on how to get around this? I am using the search API, and I am open to using any API.
One more question: is it possible to collect the public posts at regular intervals (all posts, without a query term)? If this is possible, then I can save them in my database and perform the search on that.
This sounds like a job for the streaming API. You can think of it as setting a keyword and opening a firehose where you will receive tweets containing your keyword until you close the connection. The streaming API is designed for persistent connections that track a limited number of keywords. You basically log in with a default user.
This 140 PHP Development Framework is a great help in working with the Twitter streaming API in PHP.
Resources:
Twitter Streaming API Information -
https://dev.twitter.com/docs/streaming-apis
140 Twitter Streaming API Framework -
http://140dev.com/free-twitter-api-source-code-library/
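Once statuses arrive from the stream, the per-day count the question asks for is a small aggregation. The sketch below is in Python for illustration and assumes each status is a decoded JSON object whose "created_at" field looks like "Wed Aug 27 13:08:45 +0000 2008"; the function name is hypothetical and the stream connection itself is left out.

```python
from collections import Counter
from datetime import datetime

# Assumed layout of the "created_at" field,
# e.g. "Wed Aug 27 13:08:45 +0000 2008".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def count_by_day(statuses):
    """Return a Counter mapping 'YYYY-MM-DD' to the number of statuses that day."""
    counts = Counter()
    for status in statuses:
        created = datetime.strptime(status["created_at"], CREATED_AT_FORMAT)
        counts[created.date().isoformat()] += 1
    return counts
```

In a long-running consumer you would update the counter as each status is received instead of collecting the statuses first, which also means you never need to store the tweets themselves.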

Twitter Threaded Conversation

I've been wanting to create something like this: http://twitter.theinfo.org/
A script which finds replies to a tweet and shows them in a threaded fashion like this:
http://twitter.theinfo.org/45967981225840640
Any help on where to start or if there's an implementation already out there for me to tinker with?
Going up the thread is easy because replies have in_reply_to_status_id, but finding replies to a status is near impossible. You have to maintain a search looking for tweets to a specific user and check whether each one is a reply, in which case you save it.
What I would do if I had to do that is run a cron job that stores the in_reply_to_status_id field, and then query that.
You could theoretically expect that only his followers will reply to him, if you'd like a starting point
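If you do store statuses together with their in_reply_to_status_id, rebuilding the thread is a matter of indexing replies by parent. The following is a rough Python sketch; the "id", "text" and "in_reply_to_status_id" keys are assumed to match the stored status objects, and the function names are hypothetical.

```python
def build_reply_index(statuses):
    """Map each parent status ID to the list of stored statuses replying to it."""
    children = {}
    for status in statuses:
        parent_id = status.get("in_reply_to_status_id")
        if parent_id is not None:
            children.setdefault(parent_id, []).append(status)
    return children


def render_thread(statuses, root_id):
    """Return the thread under root_id as text lines indented by reply depth."""
    by_id = {status["id"]: status for status in statuses}
    children = build_reply_index(statuses)
    lines = []

    def walk(status_id, depth):
        status = by_id.get(status_id)
        if status is None:
            return  # the status was never stored
        lines.append("  " * depth + status["text"])
        # Sort by ID so replies are listed in a stable order.
        for child in sorted(children.get(status_id, []), key=lambda s: s["id"]):
            walk(child["id"], depth + 1)

    walk(root_id, 0)
    return lines
```

This only shows replies you have already collected, so the thread is as complete as the search that feeds the store.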
You can use the Twitter OAuth API from numerous languages; however, the type of data you can get is limited to the API calls they provide.
What you're probably looking for is something like their retweets call:
http://dev.twitter.com/doc/get/statuses/retweets/:id
You may want to take a look at the Twitter API Wiki;
http://apiwiki.twitter.com/w/page/22554648/FrontPage
as well as their Tutorials listing;
http://apiwiki.twitter.com/w/page/22554678/Tutorials

Making a Twitter Bot in PHP?

I've made a twitter bot using the CURL library for PHP, and a MySQL db.
But I would like to expand its functionality by automatically retweeting tweets marked with a certain hashtag.
I do not know quite how to do this, so would any of you be willing to point me to a learning resource on how to do this?
Or if you really want to, show me how to add this functionality?
Thanks!
What you need is a daemon. It has to search repeatedly for the hashtag you are looking for (check the search documentation).
Then it has to retweet each match.
You should also save each retweet in a database, so your daemon won't retweet the same tweet twice.
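One pass of such a daemon might look like the sketch below (Python, for illustration only). The search results are assumed to be a list of objects with an "id" key, seen_ids stands in for the database of already retweeted IDs, and retweet is a hypothetical callable that performs the actual API request; none of these names come from a real library.

```python
def retweet_new(results, seen_ids, retweet):
    """Retweet every search result not handled before.

    results  -- iterable of tweet objects with an "id" key (assumed shape)
    seen_ids -- set of tweet IDs already retweeted; updated in place
    retweet  -- hypothetical callable taking a tweet ID and doing the API call
    Returns the list of IDs retweeted during this pass.
    """
    done = []
    for tweet in results:
        tweet_id = tweet["id"]
        if tweet_id in seen_ids:
            continue  # already retweeted on an earlier pass
        retweet(tweet_id)
        seen_ids.add(tweet_id)
        done.append(tweet_id)
    return done
```

The daemon would call this every few minutes with fresh search results and persist seen_ids between runs, for example in the MySQL database the bot already uses.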

Tweet Contest logic ( Twitter )

Disclaimer: I have no Twitter API experience nor have I used Twitter until today
I've been given the task of creating a 'tweeting contest' - if anyone has Twitter API experience and/or has done this in the past, I would appreciate any useful tips that you may have.
So the basic rules are that in order for a user to enter the contest, said user must follow the contest's twitter and must retweet with a specific message, such as 'just entered a contest for http://foo.com/contest'.
Questions:
1. To get the entrants, I have to parse the rss feed of the contest, http://twitter.com/statuses/user_timeline/21586418.rss seems to only list the last few posts so I would probably have to interact with the Twitter API in order to get all messages. Can someone recommend documentation or a page that covers this?
2. I'm not exactly sure if I should store the actual users in a local xml file or rely on querying the Twitter API, if I store them I would have a cache local copy of users... a database would be overkill and if I were to store them it would be better off in an xml file, right?
3. Related to #1, should I actually parse for the exact message which the user has to tweet, eg "just entered a contest", the exact string when I parse through the data feed of all the tweets? Or is there some sort of tagging system I can use?
4. Related to #1, I would have to determine whether the user is a follower or not, so I can't determine that by parsing an entry/tweet, I would have to query the user's id and grab statistics from the people he/she follows?
You could search for the URL, but the best approach would be to use a hashtag:
just entered #supercoolcontest for http://foo.com/contest
You can search for incidences of #supercoolcontest which contain the required contest URL or whatever other keywords you might want. This will ensure users don't have to be text-precise when retweeting, and also gives people a way to talk about the contest in a general way that is trackable.
You can pull all tweets with a hashtag by using the search API:
http://search.twitter.com/search.json?q=%23supercoolcontest
This is probably the most efficient approach, since you are guaranteed to only pull the tweets you're interested in, instead of n tweets from n users, only a tiny fraction of which has anything to do with you.
Every time you scrape that API feed (every n minutes), insert new unique users. I'd use a database - not hard or time consuming to stand something up with a table or two. Easier to query against later.
To answer your last question, you do need to make a separate API call to determine if a given user follows another user.
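The scrape-and-insert step can be sketched as below, in Python for illustration. It assumes each search result carries "text" and "from_user" keys (treat those field names as an assumption about the search JSON) and it uses a set where the answer suggests a database table. The function name is hypothetical, and the follower check is left out because it needs a separate API call.

```python
import re


def collect_entrants(results, hashtag, url, entrants=None):
    """Add users whose tweet contains both the hashtag and the contest URL.

    results  -- iterable of objects with "text" and "from_user" keys (assumed)
    entrants -- optional set of user names from earlier scrapes; updated in place
    """
    if entrants is None:
        entrants = set()
    # Match the hashtag case-insensitively, but not as a prefix of a longer tag.
    tag_re = re.compile(re.escape(hashtag) + r"(?!\w)", re.IGNORECASE)
    for tweet in results:
        text = tweet["text"]
        if tag_re.search(text) and url in text:
            entrants.add(tweet["from_user"])
    return entrants
```

Because the entrants are kept in a set, calling the function on every scrape only adds users who are new. If entrants paste the link through a URL shortener the plain substring test will miss them, so you may prefer to match on the hashtag alone.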
I know this is an old question and is probably not relevant to meder anymore; nonetheless, I want to point out that there is now another way to solve this problem using Twitter's Streaming API: http://dev.twitter.com/pages/streaming_api. The advantage of this approach is that you are telling Twitter to send you all the tweets that match certain conditions right when they are generated.
With the search API you need to poll Twitter for new tweets all the time, and there is a bigger chance that some of them will be missing from the search results. With the streaming API you keep an open connection to Twitter and process the tweets as they come. Twitter won't guarantee that you will get all the tweets that meet the conditions, but in my experience the risk is much lower.

Find Users based on Twitter Friends

I have an app where I pull in tweets with a certain hash tag. When I find the hash tag, the app automatically creates a user if they don't exist. When the user logs in via Twitter, I want to be able to present them with their friends who are also using the app. The problem is that for Twitter users with a ton of friends there is a max response of 100, and I'd have to hit the API 10 times to get the users of someone with 1000 friends.
Also, when pulling the friends info, should I just cache the friends in an array and move to a matched array so I don't have to hit the API again?
Given that most Twitter apps have a per hour limit on API calls you really should cache pretty much everything. Check the cache to see if you have the data first before pulling down any information.
If you are worried about how up-to-date the data is then put a time stamp in the cache. When you try to access something from the cache check whether the time difference to now is larger than some defined amount (depending on how fresh your data needs to be & how much you can keep hitting the server with requests) and if it is go and refresh the data.
This is a little like writing a good web crawler (which Jeff Atwood seems to suggest has only been done by Google). It is easy to write something that will attempt to pull down everything from the internet at once but it is more difficult to write something that will do it in a sustainable, manageable way.
Twitter have been sensible in forcing people to think through these issues by placing a "per-hour access count" on their API.
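The timestamp check described in this answer can be sketched as a small cache wrapper. This is an illustrative Python version; the class name, the max_age_seconds parameter and the fetch callable are all made up for the example, and a real app would more likely keep the entries in a database or a shared cache than in a dictionary.

```python
import time


class TimedCache:
    """Keep fetched values with a timestamp and refresh them once they are stale."""

    def __init__(self, max_age_seconds, clock=time.time):
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so the cache can be tested without sleeping
        self._entries = {}   # key -> (time stored, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() only when needed."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]  # still fresh, no API call
        value = fetch()      # missing or stale: spend one API call
        self._entries[key] = (now, value)
        return value
```

For example, cache.get("friends:12345", fetch_friends) would only call the (hypothetical) fetch_friends function when no copy newer than max_age_seconds exists.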
I found an API call that returns just the IDs of a Twitter user's friends. It can return upwards of 5000 IDs and tries to return all of them. The docs for the call are here: http://apiwiki.twitter.com/Twitter-REST-API-Method:-friends%C2%A0ids
What I did was take the response from the API call and create a SQL statement utilizing IN. This way, I can now handle all my sorting and so forth via SQL, rather than doing a nasty array compare.
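A rough sketch of the IN approach, using Python's sqlite3 module for illustration. The users table and its twitter_id and screen_name columns are hypothetical, and the IDs are split into chunks because databases limit how many parameters one statement may carry.

```python
import sqlite3


def find_app_friends(conn, friend_ids, chunk_size=500):
    """Return (twitter_id, screen_name) rows for friends who are also app users.

    conn       -- open sqlite3 connection with a hypothetical `users` table
    friend_ids -- IDs returned by the friends IDs call
    """
    ids = list(friend_ids)
    matches = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # One "?" placeholder per ID keeps the values out of the SQL string.
        placeholders = ",".join("?" for _ in chunk)
        query = (
            "SELECT twitter_id, screen_name FROM users "
            f"WHERE twitter_id IN ({placeholders})"
        )
        matches.extend(conn.execute(query, chunk).fetchall())
    # Sort once at the end so the order holds across chunks.
    return sorted(matches, key=lambda row: row[1])
```

When the ID list fits into a single statement you can drop the chunking and let an ORDER BY clause do the sorting in SQL, which is closer to what the answer describes.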
