How to practice good ethics while executing many curl requests in PHP

I have done a fair amount of reading on this and I am not quite sure what the correct way to go about this is.
I am accessing a website's API that provides information I use on my site. On average I will be making over 400 different API requests, which means over 400 curl requests. What is the proper way to make my code pause for a while and then continue? The site does not limit the number of hits, so I will not get banned for pulling everything at once, but I would not want to be that server when 10,000 people like me do the same thing. What I am trying to do is pause my code and use the service they offer politely.
What is the best method to pause PHP execution with resource consumption in mind?
What is the most courteous number of requests per wait cycle?
What is the most courteous length of wait per cycle?
With all of that in mind, I would also like to obtain the information as fast as possible while staying within the limits above.
sample eve central API response
Thank you in advance for your time and patience.

Here's a thought: have you asked? If an API has trouble handling a high load, the provider usually states a limit in its terms. If not, I'd recommend emailing the service provider, explaining what you want to do, and asking what they think would be a reasonable load. It's quite possible that their servers can handle any load you might reasonably give them, which is why they don't specify a limit.
If you want to do right by the service provider, don't just guess what they want. Ask, and then you'll know exactly how far you can go without upsetting the people who built the API.
For the actual mechanics of pausing, I'd use the method alex suggested (but has since deleted): PHP's usleep.
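As a rough illustration of pausing with usleep, here is a minimal sketch. The function names (fetchPolitely, curlGet), the batch size of 10 and the half-second pause are all assumptions made up for the example, not values the API provider has endorsed; the right numbers are whatever the provider tells you.

```php
<?php
// Hypothetical helper: call $fetch for each URL and pause after every
// $batchSize requests. usleep() blocks without burning CPU while it waits.
function fetchPolitely(array $urls, callable $fetch, int $batchSize = 10, int $pauseMicros = 500000): array
{
    $results = [];
    $count = 0;
    $total = count($urls);
    foreach ($urls as $url) {
        $results[$url] = $fetch($url);
        $count++;
        // Sleep between batches, but not after the final request.
        if ($count % $batchSize === 0 && $count < $total) {
            usleep($pauseMicros);
        }
    }
    return $results;
}

// One possible $fetch built on the curl extension (assumed to be installed).
// Returns the response body, or null if the transfer failed.
function curlGet(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

You would call it as fetchPolitely($urls, 'curlGet'). Passing the fetcher in as a callable keeps the pacing logic separate from curl, so the delay can be tuned or tested without touching the network.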

Related

Multiple website makes an API call at the same time

First of all, apologies for my bad English.
I just want to clarify things.
Suppose multiple users of a website (let's say 1000) try to access my API's endpoint (let's say everyone is accessing the registration endpoint) at the same time (or, if not at exactly the same time, within nanoseconds of each other). What will happen? Will everyone get the same response time, or will the first one to access it get a response faster than the second?
Based on my knowledge (yeah, I'm stupid), I think the API will handle requests in a queue, so if you're the 1000th user you will wait much longer for the response. If this is true, is there a way to lessen the delay?
Thank you very much for your time explaining things :)
You're right about the queue. If there are 1000 users accessing your API at the same time, some of them will most likely wait.
You can fine-tune the number of simultaneous requests you accept. I assume you're using nginx or Apache. In nginx, for example, you would raise worker_processes and worker_connections as far as possible, but make sure your server can actually handle them.
If you want to use more servers, you can put a load balancer in front of them that serves each request from whichever server is available at the moment, or from one chosen at random.

API Usage Limiting

Are there best practices for storing/limiting API usage? I'm looking to build a basic API using Laravel, but when trying to think about limiting daily API usage I'm getting stuck on the best approach.
Do I log each call in the database and use that to aggregate total API calls for the day to determine if the limit has been reached? What about concurrent API requests? If I want to limit to 1 API call every 5 seconds, is it best to do a database query and determine that?
Any advice would be much appreciated!
This is one of those things that I like to leave up to an existing provider.
To keep things simple, put something in front of your API which has the sole responsibility of rate limiting and controlling API consumers. This is often done with an API proxy of some kind.
3scale is a good solution. It's effectively Nginx with a module for doing all the heavy lifting. http://www.3scale.net/ It's also cheap (or free depending on your load).
There are others out there like Mashery, but frankly I've had terrible luck with Mashery since Intel bought them. DNS resolution issues, skyrocketing prices, etc.
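If you do end up counting calls yourself rather than using a proxy, the "1 API call every 5 seconds" rule from the question could be sketched roughly like this. FixedWindowLimiter is a hypothetical class invented for illustration, and it keeps its state in a plain array; a real API would need a store shared across requests (a database row, or a cache), which is an assumption left out here.

```php
<?php
// Hypothetical fixed-window limiter: at most $limit calls per
// $windowSeconds for each API key. State is in memory only.
final class FixedWindowLimiter
{
    private int $limit;
    private int $windowSeconds;
    /** @var array<string, array{start: int, count: int}> */
    private array $windows = [];

    public function __construct(int $limit, int $windowSeconds)
    {
        $this->limit = $limit;
        $this->windowSeconds = $windowSeconds;
    }

    // $now is passed in (e.g. time()) so the logic is easy to test.
    public function allow(string $apiKey, int $now): bool
    {
        $w = $this->windows[$apiKey] ?? null;
        // Start a fresh window if none exists or the old one has expired.
        if ($w === null || $now - $w['start'] >= $this->windowSeconds) {
            $w = ['start' => $now, 'count' => 0];
        }
        if ($w['count'] >= $this->limit) {
            $this->windows[$apiKey] = $w;
            return false;
        }
        $w['count']++;
        $this->windows[$apiKey] = $w;
        return true;
    }
}
```

A daily cap is the same idea with a larger window, for example new FixedWindowLimiter(1000, 86400). Note that a fixed window allows short bursts around the window boundary, which is one reason dedicated proxies tend to use more careful schemes.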

How to tell if someone has left

I was wondering how I could check if someone has left the site/page and perform an action after they left. I was reading on here and I found this:
No, there isn't. The best you can do is send an AJAX request every X seconds (perhaps only if the user moves the mouse). If the server doesn't receive any requests for 2X seconds, assume that the user left.
That's what I had planned for before but how could you make the server do something (in my case it's to remove them from the DB) if they stop sending the request? An example I can think of is how on Facebook when you go to the site you tell them you're here and online which marks you as online in chat but when you leave it marks you as offline. How is that possible?
Edit: After a while of using cron jobs I found out that web hosting sites don't like running cron jobs often enough to give your site a "live" feel. So instead I found node.js, and it works 1000x better and is much simpler. I'd recommend it to anyone with the same issue: hosting for it is very cheap, and it's simple to learn. If you know JavaScript you can build in it with no problem. Just be sure you know how async works.
A not uncommon approach is to run a cron job periodically that checks the list of users, and does XYZ if they have been inactive for X minutes.
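The cron approach could look something like the sketch below. The function name findInactiveUsers and the shape of $lastSeen (user id mapped to the Unix timestamp of the last AJAX heartbeat) are assumptions for illustration; in practice the timestamps would come from your users table and the returned ids would feed a DELETE or UPDATE.

```php
<?php
// Hypothetical cron-side helper. $lastSeen maps user id => Unix timestamp
// of the most recent heartbeat request from that user.
function findInactiveUsers(array $lastSeen, int $now, int $timeoutSeconds): array
{
    $inactive = [];
    foreach ($lastSeen as $userId => $seenAt) {
        // Anyone silent for longer than the timeout is treated as gone.
        if ($now - $seenAt > $timeoutSeconds) {
            $inactive[] = $userId;
        }
    }
    return $inactive;
}
```

With a heartbeat every X seconds, a timeout of about 2X gives one missed request of slack before a user is considered to have left.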
Facebook uses the XMPP protocol via the Jabber service to have a constant or real-time connection with the user. However, implementing one isn't an easy task at all. The most simple solution would be, as mentioned in the comments, to have the client make AJAX requests to the server every several seconds, so that the server may check whether the user is still viewing the site or not.
You might want to check out my question, which might be related to yours.
The only method I can remember from my time developing is not 100% reliable: a number of factors can, and most likely will, cause it to misfire or not run fully, up to and including someone disabling JavaScript. Granted, that isn't highly likely given how websites are put together today, but people have the option to turn it off, and people who are acting maliciously tend to have it off as well.
Anyway, the method I have read about but never put much stock in is the onunload event, which you attach to something like the <body> tag.
Example:
<body onunload="myFunction()">
Where myFunction() is a bit of JavaScript that does whatever you're seeking to have done.
Again, it's not superbly reliable for many reasons, but I think it's the best you have with almost any client-side language.

I have a nightly batch that fetches newsfeed info from social networks...it takes too long on PHP. Is there a better way?

Basically, as the title says...I have a site that fetches newsfeed information for each account from facebook, Linkedin and twitter. With many accounts, the time it takes to run the nightly batch is verrrryyy looong. Should I not do this in PHP? Is there a better way to run the batch?
Any tips would be appreciated!
The answer to your problem here most likely resides with curl_multi_exec(): it allows you to make many HTTP requests at once and will likely drastically reduce the time it takes to transfer the data.
I might recommend node.js for this: just make the calls and assign callbacks to handle the responses. You don't have to wait for each reply before moving on to the next action.
If you want to stick with php, as DaveRandom mentioned, curl_multi_exec() is the way to go.
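For reference, a minimal curl_multi sketch might look like this. The wrapper name fetchAll and the 10-second timeout are assumptions; the curl_multi_* calls themselves are the standard ones from PHP's curl extension. Treat it as a starting point, not a tuned implementation (there is no cap on how many transfers run at once, for instance).

```php
<?php
// Fetch several URLs concurrently. Returns url => body, or null on failure.
function fetchAll(array $urls, int $timeoutSeconds = 10): array
{
    if ($urls === []) {
        return [];
    }
    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Per-transfer results are reported through curl_multi_info_read().
    $failed = [];
    while (($info = curl_multi_info_read($multi)) !== false) {
        if ($info['result'] !== CURLE_OK) {
            $failed[] = $info['handle'];
        }
    }

    $results = [];
    foreach ($handles as $url => $ch) {
        $results[$url] = in_array($ch, $failed, true) ? null : curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
    return $results;
}
```

For a nightly batch with many accounts you would probably feed this in chunks (say, a few dozen URLs at a time) so you don't open hundreds of connections to the same provider at once.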

AJAX/PHP Why is HTTP-Polling so laggy?

Why is HTTP-Polling so laggy?
What I have is a button, and whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user. I'm polling every 800 milliseconds and it's very laggy/glitchy. Sometimes when clicking the button it doesn't register it. And I actually need to be polling quite a bit more frequent than every 800 milliseconds.
This is also with just 1 user on the website at a time... When in the end there is going to be many at once.
HTTP-streaming/Long-polling/Websockets instead of polling
When you need real-time information you should avoid frequent polling. Below I will try to explain why it is the wrong approach. You could compare it to a child in the back of your car screaming "are we there yet?" every second while you keep replying "we are not there yet".
Instead you want something like long-polling/HTTP-streaming or WebSockets. You could compare this to a child in the back of your car asking you to let him know when "we are there" instead of asking every second. You can imagine this is far more efficient than the previous example.
To be honest, I don't think PHP is the right tool for this kind of application (yet). Some options you have available are:
Hosted solutions:
http://pusherapp.com:
"Pusher is a hosted API for quickly, easily and securely adding scalable realtime functionality via WebSockets to web and mobile apps.
Our free Sandbox plan includes up to 20 connections and 100,000 messages per day. Simply upgrade to a paid plan when you're ready."
http://beaconpush.com/:
"Beaconpush is a push service for creating real-time web apps using HTML5 WebSockets and Comet."
Host yourself:
http://socket.io:
"Socket.IO aims to make realtime apps possible in every browser and mobile device, blurring the differences between the different transport mechanisms"
Once you become very big, the "host yourself" solution is going to be less expensive, but on the other hand something like pusherapp will get you started more easily (friendly API) and is also not that expensive. For example, pusherapp's "Bootstrap" plan allows 100 concurrent connections and 200,000 messages per day for $19 per month (but while you are small beaconpush is cheaper => do the math :)). As a side note, this plan does not include SSL, so it cannot be used for sensitive data. I guess a dedicated machine (VPS) will cost you about the same amount of money (for a simple website), and you will also have to manage the streaming solution yourself, but as you get bigger this is probably far more attractive.
Memory instead of Disc
"whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user"
Compared to memory, disk I/O (MySQL in standard mode) is extremely slow. You should be using an in-memory database such as redis (which also has persistent snapshots) or memcached (completely in memory) to speed up the process. I myself really like redis for its insane speed, simplicity and persistent snapshots. http://redistogo.com/ offers a free plan with 5MB of memory, which will probably cover your needs. If not, the mini plan at $5 a month will probably cover you, but once you get even bigger a VPS will be cheaper and is, in my opinion, the preferred solution.
Best solution
The best solution (especially if you are getting big) is to host socket.io/redis yourself on a VPS (which costs money). If you are really small I would use redistogo; if not, I would host redis myself. I would also start with something like beaconpush/pusherapp because of its simplicity (you can get started immediately). Hosting socket.io yourself is pretty simple (I advise playing with it on your own machine in preparation for getting big), but in my opinion more difficult than beaconpush/pusherapp.
Laggy/glitchy? Sounds like a client-side problem. As does the button thing. I'd get your JavaScript in order first.
As for polling, 0.8 seconds sounds a bit time-critical. I don't know about most countries, but here in the third world simple network packets may get delayed for as long as a few seconds. (Not to mention connection drops, packet losses and the speed of light.) Is your application ready to cope with all that?
As for an alternate approach, I agree with @Vern that an interrupt-driven one would be much better. In HTTP terms, it translates to a long-standing HTTP request that does not receive a response until the server has some actual data to send, minimizing delay and bandwidth. AFAIK it's an older technique than AJAX, though it was named more recently. Search for "COMET" and you'll end up with both client- and server-side libraries.
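To make the long-standing request idea concrete, the server side of a long poll could be sketched as below. Everything here is an assumption for illustration: waitForChange is a made-up name, $readVersion stands in for however you read the current value (a cache lookup, a query), and the 25-second timeout and 200 ms check interval are arbitrary. Note that each waiting client ties up a PHP worker for the duration, which is part of why other answers here steer towards non-PHP tools.

```php
<?php
// Hypothetical long-poll helper: hold the request open until $readVersion()
// reports something newer than what the client already has, or time out.
function waitForChange(callable $readVersion, int $knownVersion, float $timeoutSeconds = 25.0, int $pollMicros = 200000): ?int
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $current = $readVersion();
        if ($current > $knownVersion) {
            return $current; // new data: respond immediately
        }
        usleep($pollMicros); // cheap server-side wait between checks
    } while (microtime(true) < $deadline);
    return null; // nothing new; the client should simply reconnect
}
```

The client sends the version it last saw, the server answers only when there is something newer, and the client immediately issues the next request after each response.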
There are many things that might cause the lag you are experiencing. Your server might be able to process the requests fast enough, but if the connection between your client and the server is slow, you'll see obvious lag.
The first thing you should try is to ping the server and see what response time you're getting.
Secondly, rather than poll, you might want to consider an interrupt-driven approach. This means that you send out your next request only after your server has replied to the previous one. This makes sense because it stops many clients from flooding the server with requests to the point where the server cannot cope. This is especially true when the RTT (round-trip time) of your request is pretty long.
Hope it helps. Cheers!
A good place to start would be to use a tool like Firebug in Mozilla Firefox that will allow you to watch the requests being sent to the server and look for bottlenecks.
Firebug will break down each part of the request, so you can see if you are having trouble talking to the server or if it is simply taking a long time to come up with a response.
Along with @Vern's answer, I would also say that, if at all possible, I would have the server cache the data ahead of time so that all of the clients pull from that same cache and don't need separate MySQL calls to reach the same data for every update. Then you just have your PHP update the cache whenever the actual DB data changes.
By cache I mean having PHP write to a file on the server side, so that clients simply look at the contents of that one file to see the most up-to-date info. There might be better ways of caching, but since I have never done this personally, this is the first solution that popped into my mind.
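A file cache along those lines might be sketched like this. The function names writeCache and readCache and the JSON format are assumptions for the example. Writing to a temporary file and renaming it is a common trick so that readers never see a half-written file; the rename is atomic on POSIX filesystems when both paths are on the same filesystem.

```php
<?php
// Write the cache via a temp file + rename so readers never see partial data.
function writeCache(string $path, array $data): void
{
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, json_encode($data));
    rename($tmp, $path);
}

// Read the cache; returns null if the file is missing or not valid JSON.
function readCache(string $path): ?array
{
    if (!is_file($path)) {
        return null;
    }
    $decoded = json_decode(file_get_contents($path), true);
    return is_array($decoded) ? $decoded : null;
}
```

The PHP code that updates the database would call writeCache() right after a successful update, and the polling endpoint would only ever call readCache().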
