I'm building a WordPress plugin whose job is to calculate how much each published post should be paid depending on how many visits it registers. It relies on Google Analytics.
Now, when a post is published, it takes some time before it can be paid. Specifically, a post is ready to be paid when its visit count exceeds a pre-set threshold (let's say 100, for the sake of these examples). This means that, to know whether a post is ready, the plugin needs to know how many visits it has scored from its publication date up to the current time.
Now, suppose we have:
Post A: published 20/07
Post B: published 25/07
The start date for post A in the GA request would be '2013-07-20', but for post B it would be '2013-07-25'. This means that, basically, every post would need its own request, which is unbearable: the plugin pages would take something like 30 seconds to load, AND GA would probably ban it soon. The plugin runs on big blogs as well, with thousands of published posts: even if I did some caching, there would still be a lot of data that needs to be loaded fresh from GA.
Any help on how this could be sorted out? Thanks.
Update
After two months, and after posts A and B have already been paid once, we still want to pay the posts that have reached some visit threshold. It wouldn't make sense to ask for all the posts of the blog: it would potentially take forever and return a huge amount of data. So we're only looking for posts that have, say, more than 1000 visits since their last payment. Now here comes the problem: the last payment date (which is the GA start-date) is not the same for every post. Actually, it is different for each post. How would you cope with such a request?
If you know the start and end dates, then why don't you just query for that time period and use the ga:pagePath dimension along with the metric you're after (visits, unique visits, or maybe pageviews)? Then you can parse the response to get the metric for each post. For example:
start-date=2013-07-20
end-date=2013-07-25
dimensions=ga:pagePath
metrics=ga:visits,ga:pageviews
(or do unique visits or pageviews if that's what you want)
This will list all page paths during that period with at least 1 visit/pageview.
Try the Query Explorer to get an idea of the data you want and the equivalent API query.
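If you want to do it server-side, here is a rough sketch of that query against the Core Reporting API using plain PHP and cURL. It assumes you have already obtained an OAuth access token and know your profile ID; the placeholder values are just for illustration.

// Sketch: one Core Reporting API request covering the whole date range,
// broken down by ga:pagePath, then parsed into a path => visits map.
// $accessToken and $profileId are assumed to be obtained elsewhere (OAuth).
$accessToken = 'ya29.xxxx';
$profileId   = '12345678';

$params = http_build_query(array(
    'ids'         => 'ga:' . $profileId,
    'start-date'  => '2013-07-20',
    'end-date'    => '2013-07-25',
    'metrics'     => 'ga:visits,ga:pageviews',
    'dimensions'  => 'ga:pagePath',
    'max-results' => 1000,
));

$ch = curl_init('https://www.googleapis.com/analytics/v3/data/ga?' . $params);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Authorization: Bearer ' . $accessToken));
$response = json_decode(curl_exec($ch), true);
curl_close($ch);

// Each row comes back as array(pagePath, visits, pageviews), in the order requested.
$visitsByPath = array();
$rows = isset($response['rows']) ? $response['rows'] : array();
foreach ($rows as $row) {
    $visitsByPath[$row[0]] = (int) $row[1];
}

With that single request you can look up the metric for every post's path instead of firing one request per post.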
I'm looking to code my own PHP URL shortener. I have already built a system that takes a long URL, turns it into a shortened one (something like domain.com/go/URLID), and counts the total click activity for it.
I want to add features like:
Daily usage graph (like the monthly visitor graph Google Analytics shows).
Unique clicks count.
As I said, the code I made stores the total counts, but I'm not sure how to count unique clicks.
My approach for unique click counts is to use IPs or cookies, but I'm not sure which one is more reliable (cookies may expire, and an IP check would count a whole household as one repeating clicker). How can I build this?
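To show what I mean by the cookie variant, here is a minimal sketch, assuming a clicks table with url_id, visitor_id and clicked_at columns (all names are placeholders, not an existing schema):

// Sketch: mark each browser with a long-lived random ID cookie and log one
// row per click; "unique clicks" is then a COUNT(DISTINCT visitor_id).
$pdo   = new PDO('mysql:host=localhost;dbname=shortener', 'user', 'pass');
$urlId = 123; // ID of the shortened URL being visited (placeholder)

// Assign (or reuse) a visitor ID cookie valid for one year.
if (empty($_COOKIE['visitor_id'])) {
    $visitorId = bin2hex(openssl_random_pseudo_bytes(16));
    setcookie('visitor_id', $visitorId, time() + 365 * 24 * 3600, '/');
} else {
    $visitorId = $_COOKIE['visitor_id'];
}

// Log the click; the daily graph and unique counts are computed from these rows.
$stmt = $pdo->prepare(
    'INSERT INTO clicks (url_id, visitor_id, clicked_at) VALUES (:u, :v, NOW())'
);
$stmt->execute([':u' => $urlId, ':v' => $visitorId]);

Unique clicks per URL would then be a COUNT(DISTINCT visitor_id) over those rows, and the daily graph a GROUP BY on the date part of clicked_at.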
And the other part, click statistics by day: how can I do it? I was thinking about a very, VERY long database table that stores every URL click, but I guess it would get too long, and the queries would take time (and I have a 300 MB table size limit from my server provider).
I would appreciate some help with this.
I don't mind using external but free services (as long as I can use my own domain, of course).
Thanks!
I've run into a problem while developing a WordPress plug-in. Basically, the API I'm building the plug-in for limits requests to 6 per minute; however, when the plug-in activates I need to make more than 6 requests to download the API data the plug-in needs.
The API is the LimelightCRM API (http://help.limelightcrm.com/entries/317874-Membership-API-Documentation). I'm using the campaign_view method of the API, and what I'm looking to do is potentially make the requests in batches, but I'm not quite sure how to approach the problem.
Idea 1:
Just off the top of my head, I'm thinking I'll need to count the number of requests to make with PHP on plug-in activation by using campaign_find_active, then divide that count by the request limit (6), and make 6 campaign_view requests per minute until I have all of the data I need, storing it in WordPress transients. However, say I need to make 30 requests: the user can't just sit around waiting 5 minutes for the data to download. Even if I come up with a solution for that, it might require setting the time limits on the WordPress transients in such a way that the plug-in never needs to make more than 6 requests. So my next thought is: can I use a WordPress hook to make the requests every so often, while checking when the last batch of requests was made? It's already getting very tricky, so I wonder if you might be able to point me in the right direction. Do you have any ideas on how I might be able to beat this rate limit?
Idea 2:
Cron jobs that store the values in a database?
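To make Idea 2 a bit more concrete, here is a rough sketch of what I imagine the cron approach could look like; hook names and the helper functions are made up for the example and it assumes the campaign IDs are already cached:

// Sketch of Idea 2: a WP-Cron task that makes at most 6 campaign_view
// requests per run and caches the responses in transients.

// Add a custom "every minute" schedule.
add_filter('cron_schedules', function ($schedules) {
    $schedules['every_minute'] = array(
        'interval' => 60,
        'display'  => 'Every Minute',
    );
    return $schedules;
});

// Schedule the batch job (e.g. on plug-in activation).
if (!wp_next_scheduled('llc_fetch_campaign_batch')) {
    wp_schedule_event(time(), 'every_minute', 'llc_fetch_campaign_batch');
}

add_action('llc_fetch_campaign_batch', function () {
    $ids = get_transient('campaign_find_active');
    if (!$ids) {
        return; // campaign IDs not fetched yet
    }

    $done = 0;
    foreach ($ids as $id) {
        if ($done >= 6) {
            break; // respect the 6-requests-per-minute limit
        }
        if (false === get_transient('campaign_view_' . $id)) {
            limelight_cart_campaign_view($id); // one API request, stores a transient
            $done++;
        }
    }
});

(Keep in mind that WP-Cron only fires on page loads, so "every minute" is best-effort unless a real system cron pings wp-cron.php.)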
// Fetch campaign IDs (cached in a transient so the API is hit at most once)
$llc_cids = get_transient('campaign_find_active');
if (!$llc_cids) {
    limelight_cart_campaign_find_active();             // one API request
    $llc_cids = get_transient('campaign_find_active'); // now populated
}

// Fetch campaign information for each campaign ID
$llc_cnames = array();
foreach ($llc_cids as $count => $id) {
    if (!get_transient('campaign_view_' . $id)) {
        limelight_cart_campaign_view($id);             // one API request per campaign
    }
    $llc_cnames[$id] = get_transient('campaign_view_' . $id);
}

// Merge campaign IDs and campaign info into a key => value array
$limelight_campaigns = array_combine($llc_cids, $llc_cnames);
Note: the functions limelight_cart_campaign_find_active() and limelight_cart_campaign_view() are not included because they simply make a single API request, return the response, and store it in a WordPress transient. I can include the code if needed, but for the purposes of this example that part of the plug-in is working, so I left it out.
I've come up with a solution for this, guys, and I should have thought of it before. I've arrived at the conclusion that downloading all of the API data on activation is simply impossible with the current rate limit. Most people who might use the plug-in would have far too many campaigns to download all of their data at once, and the rate limit would inevitably be exhausted most of the time if I kept the code the way it is. So rather than having all of that API data ready right after activation, I'm going to let the user make the API calls on demand, as needed, using AJAX. Let me explain how it will work.
Firstly, on plug-in activation no data is downloaded at all. The user needs to enter their API credentials, and the plug-in validates them and shows a check mark if the credentials are valid and the API log-in was successful. That uses one API request.
Now, rather than having a pre-populated list of campaigns on the "Add Product" admin page, the user simply clicks a button on that page to make the AJAX campaign_find_active request, which fetches the campaign IDs and returns a drop-down menu of campaign IDs and names. That only uses one request.
After that drop-down data is fetched, they choose the campaign they want to use, and upon choosing a campaign ID the plug-in displays another button that makes a campaign_view request to fetch the campaign data associated with that ID. This returns another drop-down menu, which allows them to choose the product. It will also take a little CSS and jQuery to show/hide the AJAX buttons depending on the drop-down values. This again uses only one API request, and because the request is not made automatically but requires a button click, the user won't fire off several API requests just by browsing through the campaign IDs in the first drop-down menu.
The user then clicks publish and has a WordPress-indexed product with all of the necessary Limelight data attached and cached. All API responses are stored in transients with a 1-hour expiry; the reason for one hour is so they don't have to wait 24 hours when they make updates. I will also include a button on the settings page to clear the transients so they can re-download on demand if necessary. That could get a bit tricky too, but for the purposes of this question it's not a problem.
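To make that concrete, here is a minimal sketch of what one of those on-demand AJAX handlers could look like (the llc_find_active hook name and the nonce are placeholders, not the plug-in's real code):

// Sketch: an admin AJAX endpoint that returns the cached campaign list.
// limelight_cart_campaign_find_active() (see above) makes the single API
// request and stores the response in the 'campaign_find_active' transient.
add_action('wp_ajax_llc_find_active', 'llc_ajax_find_active');

function llc_ajax_find_active() {
    check_ajax_referer('llc_nonce', 'nonce'); // basic CSRF check

    $campaigns = get_transient('campaign_find_active');
    if (false === $campaigns) {
        limelight_cart_campaign_find_active();              // one API request
        $campaigns = get_transient('campaign_find_active'); // now cached for 1 hour
    }

    wp_send_json_success($campaigns); // outputs JSON and exits
}

On the JavaScript side, a jQuery post to admin-ajax.php with action=llc_find_active fills the campaign drop-down from the JSON response, and the campaign_view handler would follow the same pattern.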
In total, I'm only using 3-4 API requests. I might also build a counter into it so I can display an error message to the user if they use too many requests at once. Something along the lines of "The API's limit of 10 requests per minute has been reached, please wait 60 seconds and try again."
I welcome any comments, suggestions, or critiques. Hope this helps someone struggling with API request limits; AJAX is a great way to work around them if you don't mind giving the user a little more control.
I just made 40 API accounts and randomly choose one for each request. It works well:
// Pool of API credentials; a random account is picked for each request
// to spread calls across accounts and stay under the per-account rate limit.
$api_accounts = array(
    "account1" => "asdfasdfdsaf",
    "account2" => "asaasdfasdf",
    "account3" => "asdfasdf",
);
$rand     = rand(1, count($api_accounts));
$username = "account" . $rand;
$password = $api_accounts['account' . $rand];
I know the title is complicated, but I was looking for some advice on this and found nothing.
I just want to ask if I'm thinking about this the right way.
I need to build a "top Facebook shares" page with about 10 or so of my website's items (images, articles, etc.).
This part is simple: I will just get the share count from the Facebook Graph API and update it in the database. I don't want to do it with some AJAX call triggered by the FB share button, since that could be misused.
Every item has last-update datetime, creation date, and likes fields in the database.
I will also need a top shared URL for the last 24 hours, 7 days, and month, so the idea is simple:
A user views an item; at most every 10 minutes the share count for that URL is obtained from the FB Graph API and updated in the database, which also stores the last update time.
Every time a user views the item, the site checks the last update datetime; if it is more than 10 minutes old, it makes an FB API call and updates. The 10-minute interval is there to reduce FB API calls.
This basically works, but there is a problem: concurrency.
When the item is selected, in PHP I check whether the last update was 10 minutes ago or more, and only then do I call the FB API and update the share count (if it's bigger than the current one) and the rest of the data, because a remote call is costly and I want to keep FB API usage low.
So, as long as users keep viewing items, the items keep getting updated. But the update depends on the select, and I can't do it in one SQL statement because of the time check and the remote call. So one user can come in and then another, both after the 10 minutes have passed, and there is a chance the site will call the FB API many times and update many times. The more users, the more calls and updates, and THIS IS NOT GOOD.
Any advice on how to fix this? Am I doing it right? Maybe there is a better way?
You can either decouple the API check from user interaction completely and have a separate scheduled process collect the Facebook data every 10 minutes, regardless of users.
Or, if you'd rather pursue this event-driven model, then you need to look at using a 'mutex'. Basically, set a flag somewhere (in a file, a database, etc.) which indicates that a checking process is currently running, so another one isn't started.
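For example, a minimal sketch of the database-flag variant, assuming an items table with id, last_update and share_count columns (table and column names are assumptions):

// Sketch of the mutex idea: an atomic UPDATE claims the right to refresh an
// item, so only one request per 10-minute window wins the race and calls
// the Facebook API. Everyone else just reads the stored count.
$pdo = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');

function maybe_refresh_share_count(PDO $pdo, $itemId, $url)
{
    // Atomically bump last_update only if it is older than 10 minutes.
    // rowCount() tells us whether *this* request won the race.
    $claim = $pdo->prepare(
        'UPDATE items
            SET last_update = NOW()
          WHERE id = :id
            AND last_update < NOW() - INTERVAL 10 MINUTE'
    );
    $claim->execute([':id' => $itemId]);

    if ($claim->rowCount() === 1) {
        // Only the winning request makes the remote call.
        $json  = file_get_contents('https://graph.facebook.com/?id=' . urlencode($url));
        $data  = json_decode($json, true);
        $count = isset($data['shares']) ? (int) $data['shares'] : 0;

        // Never decrease the stored count (as in the question).
        $save = $pdo->prepare(
            'UPDATE items SET share_count = GREATEST(share_count, :c) WHERE id = :id'
        );
        $save->execute([':c' => $count, ':id' => $itemId]);
    }
}

The WHERE clause doubles as the time check, so the flag and the check happen in one atomic statement and concurrent requests simply skip the API call.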
I work for an online web survey company -- our participants are compensated based on the length of the study they participate in. I'm kind of lazy. To verify the compensation is correct, I would like to track it automatically via cookies to compare with manual counts. Also, every participant might have a different count, so I'd like to be able to compensate more accurately. Time is not very effective, since someone could walk away and come back later.
Every survey works like this: "participation page".php (domain1) ----> redirect.php (domain2) ----> "external survey" (not mine) ----> "endredirect".php (domain2) ----> "Thanks page" (domain1).
Currently I record the time when you leave the participation page and when you return to the end page, calculate the number of minutes, and estimate how much you should receive.
Is it possible to see a history, using a cookie, when our participant gets back to the end page?
If possible, I just want to count backwards to the start page.
Thanks
I run a local directory website (think Yelp/Yell.com, etc.) and need to provide analytical data to the businesses listed on the site.
I need to track the following:
1) Number of visitors to specific pages (ie: Jim's widgets was viewed 65 times)
2) Number of times a user clicks a link (ie: 25 users clicked to visit your website)
I am able to do this by simply adding one to the relevant number every time an action occurs.
What I would like to be able to do is split this into date ranges, for example, last 30 days, last 12 months, all time.
How do I store this data in the database? I only need the theory, not the code! If someone can explain the best way to store this information, I would be extremely grateful.
For example, do I use one table for dates, one for the pages/links and another for the user data (links clicked/pages visited)? The only solution I have so far is to add a new row to the DB every time one of these actions happens, which isn't going to scale very well.
Thanks to anyone that can help.
I would not reinvent the wheel; use an already available solution such as Piwik. It can actually read your normal web logs to provide all the information you asked for.
If for some reason you still need a custom solution, I would not save the tracking data in pre-aggregated ranges, but rather store the exact time and URL for each individual page call (what your normal web log provides). The cumulative data can then be generated on the fly in your logic layer, e.g. through a SQL view:
SELECT COUNT(url), url
FROM calllog
WHERE calldate > NOW() - INTERVAL 30 DAY
GROUP BY url
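The logging side could be as simple as the sketch below, assuming a calllog table with url, type, and calldate columns (the type column is an added assumption to distinguish page views from outbound clicks):

// Sketch: write one row per page view or outbound click into calllog.
// The SQL view above then aggregates these rows for any date range.
$pdo = new PDO('mysql:host=localhost;dbname=directory', 'user', 'pass');

function log_call(PDO $pdo, $url, $type = 'view')
{
    $stmt = $pdo->prepare(
        'INSERT INTO calllog (url, type, calldate) VALUES (:url, :type, NOW())'
    );
    $stmt->execute([':url' => $url, ':type' => $type]);
}

// Example usage on a business page:
log_call($pdo, '/business/jims-widgets');           // a page view
log_call($pdo, '/business/jims-widgets', 'click');  // a click on the website link

Changing the WHERE clause to the last 12 months, or dropping it entirely for all-time totals, gives you the other date ranges without storing anything extra.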