I am calling different APIs on one of my websites. I am able to get optimal results with multi curl in PHP. However, I'm noticing that the speed becomes very slow when traffic is a little high. I have read that caching is another way to speed up websites. My question is: can I use caching when the API calls I am making depend entirely on user-based inputs? Or is there an alternative solution to this?
It could be that one request is taking too long to load and, as a result, is delaying the other requests.
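If that's the case, one way to limit the damage is to put a hard timeout on every handle in the curl_multi batch, so one slow API can't hold up the whole page. A minimal sketch, with placeholder URLs and user input:

```php
<?php
// Hypothetical user input driving the API calls.
$userInput = isset($_GET['q']) ? $_GET['q'] : '';

// Placeholder API endpoints.
$urls = [
    'https://api.example.com/a?q=' . urlencode($userInput),
    'https://api.example.org/b?q=' . urlencode($userInput),
];

$mh = curl_multi_init();
$handles = [];

foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2); // give up connecting after 2 seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);        // abort any single request after 5 seconds
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Run all requests in parallel.
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($running);

$responses = [];
foreach ($handles as $i => $ch) {
    $responses[$i] = curl_multi_getcontent($ch); // empty/false if the request timed out
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
```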
The answer to your question depends on what kind of tasks the users perform with the data. Basically, a cache can be used for everything related to retrieving and querying data, and is not suitable for inserting, mutating or deleting data. There are many ways to implement caching in your web application, but one of the easiest is to use GET requests for all user requests that only retrieve data, and then configure the web server or a CDN to cache them.
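As a rough illustration, the PHP side only needs to mark read-only GET responses as cacheable; the web server or CDN keys the cached copy on the full URL, so identical user inputs share one cached response. The endpoint and parameter names below are made up:

```php
<?php
// Mark read-only GET responses as cacheable so the web server / CDN can reuse them.
// The cache key is the full URL, so identical user inputs share one cached copy.
if ($_SERVER['REQUEST_METHOD'] === 'GET') {
    header('Cache-Control: public, max-age=300'); // cacheable for 5 minutes
} else {
    header('Cache-Control: no-store');            // never cache writes
}

header('Content-Type: application/json');
$query = isset($_GET['q']) ? $_GET['q'] : '';
echo json_encode(lookupSomething($query));

// Hypothetical stand-in for the real upstream API call.
function lookupSomething($q)
{
    return ['query' => $q, 'result' => 'whatever the upstream API returned'];
}
```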
Related
I'm working on a web service in PHP which accesses an MSSQL database, and I have a few questions about handling large amounts of requests.
I don't actually know what constitutes 'high traffic', and I don't know if my service will ever experience it, but would optimisations in this area largely come down to server processing speed and database access speed?
Currently when a request is sent to the server I do the following:
Open database connection
Process Request
Return data
Is there any way I can 'cache' this database connection across multiple requests? As long as requests are being processed, the connection would remain valid.
Can I store a user session ID and limit the number of requests per hour from a particular session?
How can I create 'dummy' clients to send requests to the web server? I guess I could just spam requests in a for loop or something. Are there better methods?
Thanks for any advice
You never know when high traffic will occur. High traffic might result from your search engine ranking, a blog writing a post about your web service, or any other unforeseen random event. You'd better prepare yourself to scale up. By scaling up, I don't primarily mean adding more processing power, but first of all optimizing your code. Common performance problems are:
unoptimized SQL queries (do you really need all the data you actually fetch?)
too many SQL queries (try to never execute queries in a loop)
unoptimized databases (check your indexing)
transaction safety (are your transactions fast? keep in mind that all incoming requests need to be synchronized when calling database transactions. If you have many requests, this can easily lead to a slow service.)
unnecessary database calls (if your access is read only, try to cache the information)
unnecessary data in your frontend (does the user really need all the data you provide? does your service provide more data than your frontend uses?)
Of course you can cache. You should indeed cache read-only data that does not change with every request. There is a useful blog post on PHP caching techniques. You might also want to consider the caching package of the framework of your choice, or use a standalone PHP caching library.
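For example, a tiny read-through cache in front of a read-only query might look like this (assuming the APCu extension; the table, key and TTL are just examples):

```php
<?php
// Tiny read-through cache for a read-only query, using the APCu extension.
// The "products" table, the key and the TTL are just examples.
function getProducts(PDO $pdo)
{
    $key = 'products_list_v1';
    $rows = apcu_fetch($key, $hit);
    if ($hit) {
        return $rows; // served from memory, no database round trip
    }

    $rows = $pdo->query('SELECT id, name, price FROM products')
                ->fetchAll(PDO::FETCH_ASSOC);
    apcu_store($key, $rows, 300); // keep for 5 minutes
    return $rows;
}
```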
You can limit the service usage, but I would not recommend doing this by session ID, IP address, etc. It is very easy to renew those, and then your protection fails. If you have authenticated users, you can limit requests on a per-account basis, like Google does (using an API key per user for all of their publicly available services).
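A very rough per-key limit could be kept in APCu, as in the sketch below; on a real multi-server setup you would keep the counters somewhere central such as Redis or the database instead:

```php
<?php
// Very rough hourly limit per API key, kept in APCu on a single server.
// A real setup would keep the counters somewhere central (Redis, the database, ...).
function allowRequest($apiKey, $limitPerHour = 1000)
{
    $bucket = 'rate_' . $apiKey . '_' . date('YmdH'); // one counter per key per hour
    apcu_add($bucket, 0, 3600);                       // create it with a 1-hour TTL if missing
    $count = apcu_inc($bucket);                       // atomic increment
    return $count !== false && $count <= $limitPerHour;
}

// Usage at the service entry point:
// if (!allowRequest($apiKey)) { http_response_code(429); exit('Too many requests'); }
```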
To do HTTP load and performance testing you might want to consider a tool like Siege, which does exactly what you expect.
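If you'd rather script the 'dummy clients' yourself in PHP before reaching for Siege, a crude batch of concurrent requests with curl_multi already gives you a feel for it (the URL and count are placeholders):

```php
<?php
// Quick-and-dirty load generator: fire N concurrent requests at the service
// and report how long the whole batch took.
$url = 'http://localhost/your-service.php'; // placeholder target
$concurrent = 50;

$mh = curl_multi_init();
$handles = [];
for ($i = 0; $i < $concurrent; $i++) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

$start = microtime(true);
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh);
    }
} while ($running);
printf("%d requests finished in %.2f seconds\n", $concurrent, microtime(true) - $start);

foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
```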
I hope to have answered all your questions.
I have a simple REST API written in CakePHP (PHP on Apache). Basically it has just one endpoint, let's say /api/something/?format=json. Calling this endpoint doesn't read anything from the DB; internally it fetches and parses an external website and returns the parsed data to the user in JSON format. The problem is that fetching and parsing the data from the external web page may take quite long, and therefore I need some load-balancing mechanism that will distribute API calls among several servers.
I have never done any load balancing, so I don't even know where to look for information. I am looking for the simplest solution.
Is it a resource that has to be fetched live? Because you could cache the processed data for a certain amount of time.
If it has to be live, doing it in a distributed way is probably not going to solve your problem (except when you are getting back a very large dataset).
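If it doesn't have to be live, a plain-PHP sketch of the caching route could look like the following; CakePHP's own Cache class would normally take the place of the temp-file handling here, and the URL, TTL and parser are placeholders:

```php
<?php
// Framework-agnostic sketch: keep the parsed result in a temp file for 10 minutes.
$cacheFile = sys_get_temp_dir() . '/parsed_external_page.json';
$ttl = 600; // seconds

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
    $data = json_decode(file_get_contents($cacheFile), true);
} else {
    $html = file_get_contents('http://external.example.com/page'); // the slow part
    $data = parseExternalPage($html);
    file_put_contents($cacheFile, json_encode($data), LOCK_EX);
}

header('Content-Type: application/json');
echo json_encode($data);

// Stand-in for the real parsing step so the sketch is self-contained.
function parseExternalPage($html)
{
    return ['length' => strlen($html)];
}
```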
http://en.wikipedia.org/wiki/Load_balancing_(computing)
It's pretty late, but I guess this is what you need! Just get the hardware to do all the good stuff!
Need some advice on the best approach.
Currently we are going to start a new CI web project where we need to pull data heavily from external web services or APIs.
Is it better to manipulate the data programmatically (in objects or arrays) when I need to sort it, or to store it in a database and query it with ORDER BY, GROUP BY, etc.?
Is there a known architecture or framework for this?
What's the best approach used nowadays, for example how aggregator websites do it when they pull data from various vendor APIs?
I would suggest getting the data using curl etc., manipulating it as arrays, and then storing it.
Make sure you build in some kind of caching as well, so you don't end up making unnecessary requests.
The reason behind my method is to process the data once, rather than every time your site is requested.
After all this while, I've come up with a plan and it's working great!
Consume webservices
Deserialize XML to arrays/ objects
Store in cache (APC/file cache; I'm using CodeIgniter, by the way), expiring every 4 hours
The first request takes 3-4 seconds to complete (the first call to the web service to grab the data, which is then stored in cache), while subsequent requests take 0.002 seconds thanks to the cached data. Four hours later the cycle repeats, so the data is refreshed from the web service every four hours.
If you are the first user to access the site after each refresh, you are the unlucky chap, but you sacrifice yourself for all the other chaps.
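A condensed sketch of that cycle, written here with plain APCu calls rather than CodeIgniter's cache driver (the feed URL and the XML structure are made up):

```php
<?php
// Sketch of the consume -> deserialize -> cache cycle with a 4-hour expiry.
function getFeedData()
{
    $key = 'webservice_feed';
    $data = apcu_fetch($key, $hit);
    if ($hit) {
        return $data; // the ~0.002s path: served straight from cache
    }

    // 1. Consume the web service (the slow 3-4 second path).
    $ch = curl_init('https://api.example.com/feed.xml'); // placeholder URL
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $xmlString = curl_exec($ch);
    curl_close($ch);
    if ($xmlString === false) {
        return []; // upstream failure: return nothing rather than caching garbage
    }

    // 2. Deserialize the XML into plain arrays.
    $xml = simplexml_load_string($xmlString);
    $data = [];
    if ($xml !== false) {
        foreach ($xml->item as $item) {          // <item>/<title> structure is assumed
            $data[] = ['title' => (string) $item->title];
        }
    }

    // 3. Store in cache for 4 hours.
    apcu_store($key, $data, 4 * 3600);
    return $data;
}
```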
I am writing an API for an application which will be hosted in the cloud, so that users can access it through their unique application IDs. For the time being it is all working fine and giving users the desired results. Now suddenly the question I am stuck on is how to handle multiple requests at a time. I need some suggestions on how to handle multiple requests to the API. Is there a way I can optimize my code to return results to users faster? Should I cache the users' common requests so that I can serve output directly from the cached data? Or should I save the most recently requested data in the database and use indexing to return output quickly?
Please give suggestions so that I can write a good, fast application that holds up in the long run.
Profile your code using xdebug or xhprof.
Identify bottlenecks using real-life evidence, then eliminate or minimize the bottlenecks.
Don't blindly begin caching data under the assumption that it is a performance problem.
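If installing xdebug or xhprof isn't an option right away, even crude microtime() checkpoints around the suspicious steps will show where the time actually goes. A sketch with stand-in steps:

```php
<?php
// Crude checkpoint timing. The two functions are stand-ins for the real work
// (a DB query, an external API call, ...); replace them with your own steps.
function loadFromDatabase()     { usleep(20000);  return []; }    // pretend DB work
function callExternalApi($data) { usleep(150000); return $data; } // pretend API call

$timings = [];

$t = microtime(true);
$data = loadFromDatabase();
$timings['db'] = microtime(true) - $t;

$t = microtime(true);
$result = callExternalApi($data);
$timings['api'] = microtime(true) - $t;

error_log('timings: ' . json_encode($timings)); // e.g. {"db":0.02,"api":0.15}
```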
Any idea how to implement this (http://fluin.com/63) using MySQL + PHP + JavaScript (MooTools)?
In a nutshell, it's a realtime threaded conversational web app.
Update:
This uses http://www.ape-project.org/home.html
Any idea how to implement realtime stuff without AJAX push (ape)?
Install Firefox.
Install Web Development toolbar
Install Firebug
Install HttpFox
Read the docs of the above tools regarding how to use them and what they can do.
Go to http://fluin.com/63. Use the above tools to inspect it.
Read up on databases, data models, and MySQL.
Build your own.
Well, this depends on your definition of realtime which, in its technical meaning, is simply impossible with public IP networks and a traditional TCP stack, since you have no control over timing.
Closer to the topic, though: to get any web page updated without direct user intervention, you'd have to use JavaScript to poll the server for changes since the last successful poll, and do this at certain intervals. In choosing these intervals you'll have to consider both the network/server load and the delay that is comfortable for the user.
The server, of course, will have to store the new data along with when it was created (creation timestamps are one way of doing it), to be able to distinguish fresh content from content already delivered to the various clients.
As soon as the server reports new content, it is inserted into the DOM via JavaScript and the user sees the update.
This is a bit general, of course, but you should get the idea.
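The server side of that polling scheme can be as small as an endpoint that returns everything newer than the timestamp the client last saw. A sketch using PDO, with a made-up messages table and placeholder credentials:

```php
<?php
// poll.php?since=1288000000 — return everything created after the given UNIX timestamp.
$since = isset($_GET['since']) ? (int) $_GET['since'] : 0;

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass'); // placeholder credentials
$stmt = $pdo->prepare(
    'SELECT id, body, UNIX_TIMESTAMP(created_at) AS created
       FROM messages
      WHERE UNIX_TIMESTAMP(created_at) > :since
      ORDER BY created_at ASC'
);
$stmt->execute(['since' => $since]);

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));
// The client remembers the largest "created" value it has seen
// and sends it back as ?since= on the next poll.
```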
Isn't it like a shoutbox? Here is an example of one.
Doing this properly using PHP only is very hard. With 5 users you could use long-polling, but it will definitely not scale when you have, let's say, 1000 users.
Using comet with PHP?
The screencast (link) in my post shows how you could implement it, but it has a couple of flaws:
It touches the disk (and disk is very slow compared to memory).
To make matters worse, it also polls the disk frequently (filemtime()).
Maybe phet (PHP) is able to scale. You should try that out.
To make it scale, I think you need at least:
a good implementation of long-polling (at the very least long-polling; there are better transports) that can handle the load.
to keep data in memory (much faster than disk) using something like Redis or memcached; a rough sketch follows below.
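To make the 'long-polling plus in-memory data' idea concrete, a bare-bones long-poll handler with the phpredis extension might look like this; the list name, timeout and response format are assumptions, and a production version needs considerably more care:

```php
<?php
// poll.php?seen=<number of messages the client already has>
$seen = isset($_GET['seen']) ? (int) $_GET['seen'] : 0;

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$deadline = time() + 25;                     // hold the request open for up to 25 seconds
while (time() < $deadline) {
    $total = $redis->lLen('chat:messages');  // messages live in a Redis list (assumed name)
    if ($total > $seen) {
        header('Content-Type: application/json');
        echo json_encode([
            'total'    => $total,
            'messages' => $redis->lRange('chat:messages', $seen, -1),
        ]);
        exit;
    }
    usleep(200000); // check again in 200ms instead of hammering Redis
}

// Nothing new before the deadline: return an empty batch, the client simply polls again.
header('Content-Type: application/json');
echo json_encode(['total' => $seen, 'messages' => []]);
```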
I would use:
node.js with the socket.io module (video).
to keep data in memory I would use node_redis (video).