Which is the best approach to consume web services and manipulate their data? - php

Need some advice on the best approach.
We are about to start a new CodeIgniter (CI) web project where we need to rely heavily on data from external web services or APIs.
Is it better to manipulate the data programmatically (in objects or arrays) when I need to sort it, or to store it in a database and query it with ORDER BY, GROUP BY, etc.?
Is there a known architecture or framework for this?
What's the best approach used nowadays, e.g. how aggregator websites pull data from various vendor APIs?

I would suggest getting the data using curl etc., manipulating it as arrays, and then storing it.
Make sure you build in some kind of caching as well, so you don't end up making unnecessary requests.
The reason behind my method is to process once rather than every time your site is requested.
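A minimal sketch of that fetch-then-cache idea, assuming a JSON endpoint and a writable temp directory; the URL, cache path and field names below are placeholders, not a real API:

<?php
// Minimal sketch of the fetch-once-then-cache idea.
// The endpoint URL and cache path are placeholders, not a real API.

function fetch_vendor_data(string $url, string $cacheFile, int $ttl = 3600): array
{
    // Serve from the file cache while it is still fresh.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return json_decode(file_get_contents($cacheFile), true);
    }

    // Otherwise hit the web service once with curl.
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $body = curl_exec($ch);
    curl_close($ch);

    // Manipulate as arrays (sort, filter, ...) before storing.
    $data = json_decode($body, true) ?? [];
    usort($data, fn ($a, $b) => strcmp($a['name'] ?? '', $b['name'] ?? ''));

    file_put_contents($cacheFile, json_encode($data));
    return $data;
}

$items = fetch_vendor_data('https://api.example.com/items', '/tmp/items.cache.json');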

After all this while, I've come up with a plan and it's working great!
Consume the web services
Deserialize the XML to arrays/objects
Store in cache (APC/file cache, I'm using CodeIgniter by the way), expiring every 4 hours
The first request takes 3-4 seconds to complete (the first call to the web service grabs the data and stores it in the cache), while subsequent requests take about 0.002 seconds thanks to the cached data. Four hours later the cycle repeats, so the data is refreshed from the web service every four hours.
If you are the first user to access the site after each refresh, you are the unlucky chap. But you sacrifice for all the other chaps.
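A rough sketch of how that cycle might look in a CodeIgniter 3 controller; the feed URL, cache key and view name are made up for illustration:

<?php
// Rough sketch of the consume -> deserialize -> cache cycle in a
// CodeIgniter 3 controller. The feed URL and cache key are made up.
class Listing extends CI_Controller
{
    public function index()
    {
        // APC with a file-cache fallback, as described above.
        $this->load->driver('cache', ['adapter' => 'apc', 'backup' => 'file']);

        $items = $this->cache->get('vendor_items');
        if ($items === FALSE) {
            // Cache miss: this is the "unlucky" 3-4 second request.
            $xml   = file_get_contents('https://vendor.example.com/feed.xml');
            $items = json_decode(json_encode(simplexml_load_string($xml)), true);

            // Keep it for 4 hours (14400 seconds).
            $this->cache->save('vendor_items', $items, 14400);
        }

        $this->load->view('listing', ['items' => $items]);
    }
}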

Related

Laravel, Vue.js and 3rd party APIs

I am writing a small web page at the moment that will consume a 3rd party API, process the data, and display the processed data in a table. A user will be able to change the data query via a form input.
A couple of questions I have:
1) PHP seems like a redundant language here, since I can do AJAX requests in Vue.js?
1a) However, I would like to be able to cache the 3rd party data, so that if a user selects the same query twice I don't need to go off and fetch it again. This seems like good practice?
1b) Or would it be better to cache the results page, and show that when a repeat request is made?
I am also using this exercise to start writing tests for my PHP. Is it possible to write tests for 3rd party APIs?
The answer depends on whether you need caching or not. Keep in mind that AJAX requests are sent by the browser and therefore don't cost you any server resources. Caching is only really necessary if the third party API you are using can't handle a large number of requests.
If you've decided you need caching, you'll have to access the API via your backend, which in your case means using PHP. Of course you could also write your own API dispatcher / cache in something like Node.js and use it as a microservice, but that sounds overly complicated for a small project.
In my opinion you are best off just accessing the API via AJAX in Vue; it'll save resources and is the easiest way to go. Everything else seems redundant.
Testing a third party API can be tricky and in your case is probably redundant. What you really want to test is how your application integrates with the API. You also probably want to write a mock for that API, so that you can run your tests without depending on it.
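A minimal sketch of that mocking idea with PHPUnit; WeatherClient, TemperatureTable and the temperature field are hypothetical names for illustration, not part of any real API:

<?php
// Sketch of testing your integration against a mock instead of the live API.
// WeatherClient and TemperatureTable are hypothetical names for this example.
use PHPUnit\Framework\TestCase;

interface WeatherClient
{
    public function current(string $city): array;
}

class TemperatureTable
{
    public function __construct(private WeatherClient $client) {}

    public function rowFor(string $city): array
    {
        $data = $this->client->current($city);
        return [$city, round($data['temp']) . ' °C'];
    }
}

class TemperatureTableTest extends TestCase
{
    public function testBuildsARowFromApiData(): void
    {
        // The mock replaces the real 3rd party call entirely.
        $client = $this->createMock(WeatherClient::class);
        $client->method('current')->willReturn(['temp' => 21.4]);

        $table = new TemperatureTable($client);
        $this->assertSame(['Berlin', '21 °C'], $table->rowFor('Berlin'));
    }
}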

Increase speed of dynamic API calls using PHP

I am calling different APIs on one of my web sites. I am able to get optimal results with multi-curl in PHP. However, I'm noticing that the speed becomes very slow when traffic is a little high. I have read that caching is another way to speed up websites. My question is: can I use caching when the API calls I am making depend entirely on user input? Or is there an alternative solution to this?
It could be that one request is taking too long to load and, as a result, is delaying the other requests.
The answer to your question depends on what kind of tasks users perform with the data. Basically, a cache can be used for all tasks related to retrieving and querying data, and is not suitable for inserting, mutating or deleting data. There are many ways to implement caching in your web application, but one of the easiest is to use GET requests for all user requests that only retrieve data, and then configure the web server or a CDN to cache them.
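A minimal sketch of server-side caching keyed by the user's input, assuming the APCu extension is available; call_vendor_api() is a placeholder for the existing multi-curl code:

<?php
// Sketch: caching API responses even when the call depends on user input,
// by deriving the cache key from the (normalised) input itself.
// call_vendor_api() stands in for your existing multi-curl code.

function cached_api_call(array $userInput, int $ttl = 300): array
{
    ksort($userInput);                                  // same inputs => same key
    $key = 'api_' . md5(json_encode($userInput));

    $hit = apcu_fetch($key, $success);
    if ($success) {
        return $hit;                                    // popular queries come from memory
    }

    $result = call_vendor_api($userInput);              // the slow external round trip
    apcu_store($key, $result, $ttl);
    return $result;
}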

How to efficiently construct HTML templates when all the data comes from an internal API?

Here's the context: we're currently using a basic web stack, and our website builds HTML templates with data it gets directly from the database.
For many reasons, we're splitting this into two projects: one will be responsible for talking to the database directly, the other will be responsible for displaying the data.
To make it simple, one is the API, the other one is the client.
Now we're wondering how we should ask our API for data. To us, there are two totally different options:
One request, one route, for one page. We would get a huge object to work with, containing everything needed to build the corresponding page.
One request for each small chunk of data. For example, on a listing page we'd make one request to get data about the currently logged-in user and display their name along with their avatar, then another request to get all the articles, another request to get data about the current page category...
Some like the first option; I don't like it at all. I feel like we're going to have a lot of redundancy. I'm also not sure one huge request is that much faster than X tiny requests. I also don't like binding data to a specific page, as I feel the API should be (somewhat) independent from our front website.
Some also don't like the second option; they fear we'll overload the server by making too many calls, and I can understand this fear. It also looks like it'll be hard to properly define the scope of what to send and what not to send without any redundancy. If we're sending only what's needed to display a page, isn't that the first option in the end? But isn't sending unneeded information a waste?
What do you guys think?
The first approach will be good if getting all the data is fast enough. The fewer requests, the faster the app. By redundancy I think you mean code redundancy, because sending the same amount of data in one request will definitely be faster than in 10 small non-parallel ones (network overhead). If you send a few parallel requests from the UI you can get a performance gain, of course. And you should take into account that browsers have limits on the number of parallel requests.
Another case: if getting some data is fast but other data is slow, you can return the fast data first, show a loading indicator in the UI, and load the slow data when it arrives. This improves the user experience by showing the page as fast as possible.
The second approach is more flexible, as you can reuse some requests on other pages. But it comes with a price: the logic for making these requests (gathering the information) moves to the UI code, making it more complex. And if you need the same data in another app, say a mobile one, you have to copy this logic. As a rule, keeping such code on the backend is easier.
You can also take a look at the pattern which allows you to keep business/domain logic inside one service and "frontend friendly" logic in another service (an orchestration service).
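A rough sketch of that orchestration idea in PHP: one page-level route fans out to several internal API endpoints in parallel (here with curl_multi) and hands the client a single composed payload. The internal URLs and key names are illustrative only:

<?php
// Sketch of a small "orchestration" layer: one route on the frontend side
// fans out to several internal API endpoints in parallel and returns a
// single payload for the page. The internal URLs are illustrative only.

function fetch_parallel(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $name => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$name] = $ch;
    }

    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh);
    } while ($running > 0);

    $out = [];
    foreach ($handles as $name => $ch) {
        $out[$name] = json_decode(curl_multi_getcontent($ch), true);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $out;
}

// One call per page on the client side, many small calls behind the scenes.
$page = fetch_parallel([
    'user'     => 'http://internal-api/users/me',
    'articles' => 'http://internal-api/articles?page=1',
    'category' => 'http://internal-api/categories/current',
]);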

How to persist a MongoDB cursor in between requests?

In the context of a web server:
In order to avoid re-querying (using find), one could try to keep the cursor reference returned by find between requests. The Cursor object is a complex object that holds, for example, socket connections. How can such an object be stored to avoid re-querying on subsequent web requests? I am working in Node.js but any advice is helpful (regardless of the language: Rails, C#, Java, PHP).
(I am using persistent sessions)
Facebook's and Twitter's stream features are more complex than a simple query to a DB. Systems like this tend to have two major backend components in their architecture serving you data: slow and fast.
1) The first backend system is your database, accessed via a query to get a page of results from the stream (someone's Twitter feed or their Facebook feed). When you page to the bottom or click 'more results', it just increments the page variable and queries the API for that page of your current stream.
2) The second is a completely separate system that sends realtime updates to your page via websockets, or by paging against an API call. This is the 'fast' part of your architecture. It is probably not coming from a database, but from a queue somewhere. From this queue, handlers send your data to your page, which is a subscriber.
Systems are designed like this because, to scale enormously, you can't depend on your db being updated in real time. It's done in big batches. So, you run a very small subset of that data through the fast part of your architecture, understanding that the way the user gets it from the 'fast' backend may not look exactly how it will eventually look in the 'slow' backend, but it's close enough.
So... moral of the story:
You don't want to persist your DB cursor. You want to think about 1) whether you need updates to be realtime, and 2) if so, how you can architect your system so that a first call gets you most of your data and a second call/mechanism keeps it up to date.
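Since the question invites examples in any language, here is a minimal PHP sketch of the 'slow' path: each request simply re-runs the query for the page it needs instead of holding a cursor open. It assumes the mongodb extension plus the mongodb/mongodb library; the database, collection and field names are made up:

<?php
// Sketch of the "don't persist the cursor" advice:
// each web request re-runs the query for the page it needs.
require 'vendor/autoload.php';

function feed_page(int $page, int $perPage = 20): array
{
    $collection = (new MongoDB\Client('mongodb://localhost:27017'))
        ->selectCollection('app', 'posts');

    $cursor = $collection->find(
        [],                                        // or a per-user filter
        [
            'sort'  => ['created_at' => -1],
            'skip'  => ($page - 1) * $perPage,     // cheap for feed-sized offsets
            'limit' => $perPage,
        ]
    );

    return $cursor->toArray();                     // the cursor dies with the request
}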

How to implement load balancing in a simple REST API?

I have a simple REST API written in CakePHP (PHP on Apache). Basically it has just one endpoint, let's say /api/something/?format=json. Calling this endpoint doesn't read anything from the DB; internally it fetches and parses some external website and returns the parsed data to the user in JSON format. The problem is that fetching and parsing data from the external web page may take quite long, and therefore I need some load balancing mechanism that will distribute API calls among several servers.
I have never done any load balancing, so I don't even know where to look for info. I am looking for the simplest solution.
Is it a resource that has to be fetched live? Because you could cache the processed data for a certain amount of time.
If it has to be live, doing it in a distributed way is probably not going to solve your problem (except when you're getting back a very large dataset).
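A rough sketch of that caching idea, assuming CakePHP 3's Cache class; the controller name, cache key and parseExternalPage() helper are placeholders for the existing fetch-and-parse code:

<?php
// Sketch only: cache the processed result so most requests skip the slow
// fetch-and-parse step entirely. parseExternalPage() is a placeholder.
namespace App\Controller;

use Cake\Cache\Cache;

class SomethingController extends AppController
{
    public function index()
    {
        // Only one request per cache lifetime pays the full cost.
        $data = Cache::remember('parsed_external_site', function () {
            $html = file_get_contents('https://external.example.com/page');
            return $this->parseExternalPage($html);
        }, 'default');

        $this->set('data', $data);
    }
}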
http://en.wikipedia.org/wiki/Load_balancing_(computing)
It's pretty late, but I guess this is what you need! Just get the hardware to do all the good stuff!
