PHP Web Service optimisations and testing methods

I'm working on a web service in PHP which accesses an MSSQL database and have a few questions about handling large amounts of requests.
I don't actually know what constitutes 'high traffic', and I don't know whether my service will ever experience it, but would optimisations in this area come down largely to server processing speed and database access speed?
Currently when a request is sent to the server I do the following:
Open database connection
Process Request
Return data
Is there any way I can 'cache' this database connection across multiple requests? As long as the connection remains valid, requests arriving at the same time could reuse it.
Can I store the user's session ID and limit the number of requests per hour from a particular session?
How can I create 'dummy' clients to send requests to the web server? I guess I could just fire off requests in a for loop or something; are there better methods?
Thanks for any advice

You never know when high traffic will occur. It might result from your search engine ranking, a blog post about your web service, or any other unforeseen random event. You'd better prepare yourself to scale up. By scaling up, I don't primarily mean adding more processing power, but first of all optimizing your code. Common performance problems are:
unoptimized SQL queries (do you really need all the data you actually fetch?)
too many SQL queries (try to never execute queries in a loop; see the sketch after this list)
unoptimized databases (check your indexing)
transaction safety (are your transactions fast? Keep in mind that all incoming requests need to be synchronized when calling database transactions. If you have many requests, this can easily lead to a slow service.)
unnecessary database calls (if your access is read-only, try to cache the information)
unnecessary data in your frontend (does the user really need all the data you provide? Does your service provide more data than your frontend uses?)
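As a concrete illustration of the second point, a single query with an IN() clause usually replaces a whole loop of per-row queries. A minimal sketch only; the table, columns and the $pdo/$userIds variables are illustrative, not taken from the question:

<?php
// $pdo is an existing PDO connection; $userIds is the list of IDs gathered beforehand.
$userIds = [3, 7, 42];
// Instead of one query per ID inside a loop, fetch everything in a single round trip:
$placeholders = implode(',', array_fill(0, count($userIds), '?'));
$stmt = $pdo->prepare("SELECT id, name FROM users WHERE id IN ($placeholders)");
$stmt->execute($userIds);
$users = $stmt->fetchAll(PDO::FETCH_ASSOC);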
Of course you can cache. You should indeed cache read-only data that does not change on every request. There is a useful blog post on PHP caching techniques. You might also want to consider the caching package of the framework of your choice or use a standalone PHP caching library.
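As one possible shape for that read-only caching, here is a minimal sketch using the APCu extension (any cache backend works the same way; the table name and the five-minute TTL are made up):

<?php
// Cache a read-only query result in APCu; fall back to the database on a miss.
function get_report(PDO $pdo, int $id): array
{
    $key = "report:$id";
    $cached = apcu_fetch($key, $hit);
    if ($hit) {
        return $cached;
    }
    $stmt = $pdo->prepare('SELECT * FROM reports WHERE id = ?');
    $stmt->execute([$id]);
    $report = $stmt->fetch(PDO::FETCH_ASSOC) ?: [];
    apcu_store($key, $report, 300); // keep it for five minutes
    return $report;
}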
You can limit the service usage, but I would not recommend doing this by session ID, IP address, etc. It is very easy to renew these, and then your protection fails. If you have authenticated users, you can limit requests on a per-account basis, as Google does (using an API key per user for all of their publicly available services).
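A per-account limit can be as small as a counter keyed by API key and hour. A rough sketch, again assuming APCu is available and that $apiKey comes from your authentication layer:

<?php
// Returns true while the key is under its hourly quota, false once it is exceeded.
function allow_request(string $apiKey, int $limitPerHour = 1000): bool
{
    $key = 'rate:' . $apiKey . ':' . date('YmdH'); // one counter per key and hour
    apcu_add($key, 0, 3600);                       // create it if it does not exist yet
    $count = apcu_inc($key);
    return $count !== false && $count <= $limitPerHour;
}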
To do HTTP load and performance testing you might want to consider a tool like Siege, which does exactly what you need.
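Siege (or ApacheBench) will give you proper reporting, but if you'd rather stay in PHP for a quick smoke test, a handful of concurrent 'dummy clients' can be simulated with curl_multi. The URL and concurrency below are placeholders:

<?php
$url = 'http://localhost/service/endpoint'; // adjust to your service
$concurrent = 20;

$multi = curl_multi_init();
$handles = [];
for ($i = 0; $i < $concurrent; $i++) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($multi, $ch);
    $handles[] = $ch;
}

// Run all requests in parallel and wait for them to finish.
do {
    curl_multi_exec($multi, $running);
    curl_multi_select($multi);
} while ($running > 0);

foreach ($handles as $ch) {
    echo curl_getinfo($ch, CURLINFO_HTTP_CODE), ' in ', curl_getinfo($ch, CURLINFO_TOTAL_TIME), "s\n";
    curl_multi_remove_handle($multi, $ch);
    curl_close($ch);
}
curl_multi_close($multi);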
I hope to have answered all your questions.

Related

Update multiple MVC Views on Model change with PHP

What approach and mechanisms (and probably code) should one apply to fully implement Model-to-Views data updates (transfer) on a Model-State-Change event with pure PHP?
If I'm not mistaken, the MVC pattern states an implicit requirement for data to be sent from the Model layer to all active Views, specifying that "the View is updated on Model change". (Otherwise it doesn't make much sense, as users working with the same source would see its data out of date and completely disconnected from reality.)
But PHP is a scripting language, so it's limited to "connection threads" via processes, and its lifetime is limited to the request-response cycle (as tereško kindly noted).
Thus, one has to solve a couple of issues (a minimal SSE sketch follows this list):
The client must have a live tunnel connection to the server (Server-Sent Events),
the server must be able to push data to the client (flush(), ob_flush()),
the Model-State-Change event must be raised & the related data packed for transfer,
(?) the data must be sent to all active clients (connected to the same resource/URL) together, not just the one currently working with its own process & instance of the ModelClass.php file...
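To make the first two points concrete, here is a minimal Server-Sent Events endpoint. How the Model change is detected (a file mtime below) is purely illustrative:

<?php
// sse.php - streamed as text/event-stream
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');

$lastVersion = 0;
while (!connection_aborted()) {
    clearstatcache();
    $version = @filemtime('model-state.json'); // stand-in for a real change check
    if ($version !== false && $version > $lastVersion) {
        $lastVersion = $version;
        echo 'data: ' . file_get_contents('model-state.json') . "\n\n";
        if (ob_get_level() > 0) {
            ob_flush();
        }
        flush();
    }
    sleep(2);
}

On the client side, new EventSource('sse.php') in JavaScript subscribes to this stream.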
UPDATE 1: So it seems that "simultaneous" interaction with multiple users in PHP involves implementing a web server over sockets of some sort, independent of NGINX and others... making its core non-blocking I/O, storing connections & "simply" looping over the connections, serving data...
Thus, if I'm not mistaken, the easiest way is still to go and get some ready-made solution like Ratchet, be it a 'concurrency framework' or a web server on sockets...
Too much overhead for a couple of messages a day, though...
AJAX short polling seems to be quite a solution for this dilemma....
Is simultaneously updating multiple clients easier with some backend other than PHP, I wonder? Look at C#: it's event-based, not limited to "connection threads" and to the request-reply life cycle, if I remember correctly... But it's still the web (over the same HTTP?)...

Laravel rate limiting API authenticated with API Token

I am building a restful API for users of my Laravel application to retrieve their data.
The current plan is that they can generate an API Token within the application to then authenticate their API requests. I do not know from where they will be making the requests.
The main reason I want to implement rate limiting is to reduce the impact of accidental/intentional DDoS, as well as to enforce it as part of the user's current subscription package (which is necessary). Because of the latter, different users may have different rates.
Laravel already provides a rate limiter built in, including access to dynamic user limits specified in the User table.
I'm wondering, though, how the session is handled. From what I can see, the Laravel TokenGuard class does not store the user between requests. Therefore the user is being retrieved on every request, even just to read the rate limit. This seems to defeat the point of the rate limiter if we are still making database queries each time.
What is the appropriate way to handle this?
If I write my own authentication middleware, and store the user in the session, would that work? Do requests sent from another server (not a browser) even handle sessions?
Thanks.
Every time anyone accesses your site, you are spinning up an entire Laravel instance, which already puts stress on your server. A DDoS doesn't depend on stressing your DB only. If someone is determined to DDoS you, you are going to notice! All you can do is mitigate the problem, so don't worry too much that each request has an associated DB call.
You could have a local session, but in the long run this is a bad design decision, since it introduces state on your server, which will make scaling in the future much harder. (See https://12factor.net/ for more on that.) This is why Laravel uses the user stored in the DB instead.
Unless you are doing something pretty special, it's generally safe to assume that Laravel is using an adequate solution. They do frameworks so that you can worry about business logic!
Finally, there are many websites out there. The chances are that by the time you're big enough to attract the attention of people trying to DDoS you (remember it takes resources and, therefore, money), you'll probably be using a much more sophisticated system.
If a request with some kind of token reaches your application, you should not need any kind of session. As you've assumed, a session is usually handled through cookies, but raw HTTP calls (like the ones cURL makes) usually don't use them.
Don't overestimate the cost of getting the current user from the database. If your application performs some more actions, those additional actions will make the difference! Getting one entity from the database is fairly cheap compared to everything else, and it is obviously necessary in order to check the proper permissions and rate limits.
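That said, if profiling ever shows the per-request user lookup to matter, it can be cached without introducing sessions. A rough sketch only; the cache key, TTL and the exact token lookup depend on how your tokens are stored (plain or hashed):

// e.g. inside a custom middleware or guard, not the stock TokenGuard
use Illuminate\Support\Facades\Cache;

$token = $request->bearerToken() ?: $request->input('api_token');

$user = Cache::remember('api-user:' . sha1($token), now()->addMinutes(5), function () use ($token) {
    return \App\User::where('api_token', hash('sha256', $token))->first();
});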
Everything else sounds like you're looking for something like Laravel Passport (see https://laravel.com/docs/5.7/passport). Additional tools like the Throttle package (see https://github.com/GrahamCampbell/Laravel-Throttle) will help you enable rate limiting for your routes.
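For the per-subscription limits, the built-in throttle middleware can already read the allowed rate from an attribute of the authenticated user. A sketch assuming a rate_limit column on the users table (the controller name is illustrative):

// routes/api.php
Route::middleware(['auth:api', 'throttle:rate_limit,1'])->group(function () {
    Route::get('/data', 'Api\DataController@index');
});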

RESTful API in PHP - optimising successive requests?

I'm working (for the first time) on developing a PHP application driven by a PHP RESTful API (probably using peej/Tonic). Coming from an application with direct database access, which might make 20 different database calls during the course of a page load, I am trying to reconcile the fact that 20 API calls means 20 handshakes (which can be improved with Guzzle persistent connections) but also 20 connections to the database.
I believe that with better programming and planning, I can get my required API calls down to 4-5 per page. At this point:
a) Is it not worth considering the latency of 5x database connections + 5x handshakes per page load on account of all the other available optimisations?
b) Is there an existing method by which this can be mitigated that I've thus far failed to find?
c) I believe it violates the principles of RESTful design, but if I had a single API method which itself gathered information from other API endpoints (for instance, GET suppliers WHERE x=y, then GET products for each supplier), is there a documented method for internal API interaction (particularly within peej/Tonic or other frameworks)?
Thank you all for your wisdom in advance.
Remember that the client should be "making 'a request' of the server," which is obliged to fulfill "that request." The server might "execute 20 different database queries" to prepare its response, and the client need not know nor care.
The client's point-of-view becomes, "I care what you tell me, not how you did it."
If you did wish to send query-responses directly to the client, for the client to "do the dirty-work" with these data, then you could still design your server request so that the server did many queries, all at once, and sent all of the result-sets back ... in just one exchange.
Your first priority should be to effectively minimize the number of exchanges that take place. The amount of data returned (within reason) is secondary.
Also consider that, "when the server does it, the work is naturally synchronized." When the client issues multiple asynchronous requests, those requests are, well, "asynchronous." Consider which strategy will be easier for you to debug.
If the server is given "a request to do," it can validate the request (thus checking for client bugs), and perform any number of database operations, perhaps in a TRANSACTION. This strategy, which puts the server into a very active role, is often much less complex than an interaction that is driven by the client with the server taking a passive role.
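Put differently, one "page data" style endpoint can run several queries server-side and return a single payload in one exchange. A sketch with made-up table names and connection details:

<?php
// GET /api/supplier-overview?supplier_id=42 - one exchange, several queries.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$supplierId = (int) ($_GET['supplier_id'] ?? 0);

$supplier = $pdo->prepare('SELECT id, name FROM suppliers WHERE id = ?');
$supplier->execute([$supplierId]);

$products = $pdo->prepare('SELECT id, name, price FROM products WHERE supplier_id = ?');
$products->execute([$supplierId]);

header('Content-Type: application/json');
echo json_encode([
    'supplier' => $supplier->fetch(PDO::FETCH_ASSOC),
    'products' => $products->fetchAll(PDO::FETCH_ASSOC),
]);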

Should I use WebSockets in a social network app, or will PHP/AJAX suffice?

I would like your opinion on a project. The way I started it is slowly revealing many gaps and problems that will create big issues either now or in the future.
The system will have a notification system, a friends system, a (private) message system, and other systems of that kind. All of these I have set up with:
jQuery, PHP and MySQLi. To avoid wordiness, I am getting at what the title says.
With simple PHP code and POST/GET requests, all of this works amazingly for 3-4 online users. The thing is, when I have many more users, what can I do to make better use of the server's resources? So I started looking around and found things like socket.io.
I just want someone who knows more to tell me what would be best to look into. Think about how the notification update system works now: jQuery POST requests repeated every 3-5 seconds, which is by no means right.
If your goal is to set up a highly scalable notification service, then probably not.
That's not a strict no, because there are other factors than speed to consider, but when it comes to speed, read on.
WebSockets do give the user a consistently open, bi-directional connection that is, by its very nature, very fast. Also, the client doesn't need to request new information; it is sent when either party deems it appropriate to send.
However, the time savings that the connection itself gives is negligible in terms of the costs to generate the content. How many database calls do you make to check for new notifications? How much structured data do you generate to let the client know to change the notification icon? How many times do you read data from disk, or from across the network?
These same costs do not go away when using any WebSocket server; it just makes one mitigation technique more obvious: Keep the user's notification state in memory and update it as notifications change to prevent costly trips to disk, to the database, and across the server's local network.
Known proven techniques to mitigate the time costs of serving quickly changing web content:
Reverse proxy (Varnish-Cache)
Sits on port 80 and acts as a very thin web server. If a request is for something that isn't in the proxy's in-RAM cache, it sends the request on down to a "real" web server. This is especially useful for serving content that very rarely changes, such as your images and scripts, and has edge-side includes for content that mostly remains the same but has some small element that can't be cached... For instance, on an e-commerce site, a product's description, image, etc., may all be cached, but the HTML that shows the contents of a user's cart can't, so is an ideal candidate for an edge-side include.
This will help by greatly reducing the load on your system, since there will be far fewer requests that use disk IO, which is a resource far more limited than memory IO. (A hard drive can't seek for a database resource at the same time it's seeking for a cat jpeg.)
In Memory Key-Value Storage (Memcached)
This will probably give the most bang for your buck, in terms of creating a scalable notification system.
There are other in-memory key-value storage systems out there, but this one has support built right into PHP, not just once, but twice! (In the grand tradition of PHP core development, rather than fixing a broken implementation, they decided to consider the broken version deprecated without actually marking that system as deprecated and throwing the appropriate warnings, etc., that would get people to stop using the broken system. mysql_ v. mysqli_, I'm looking at you...) (Use the memcached version, not memcache.)
Anyways, it's simple: When you make a frequent database, filesystem, or network call, store the results in Memcached. When you update a record, file, or push data across the network, and that data is used in results stored in Memcached, update Memcached.
Then, when you need data, check Memcached first. If it's not there, then make the long, costly trip to disk, to the database, or across the network.
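In PHP that read-through pattern is only a few lines with the memcached extension. The query and the 60-second safety-net TTL below are just placeholders:

<?php
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

function get_notification_count(Memcached $memcached, PDO $pdo, int $userId): int
{
    $key = "notif_count:$userId";
    $count = $memcached->get($key);
    if ($count === false && $memcached->getResultCode() === Memcached::RES_NOTFOUND) {
        $stmt = $pdo->prepare('SELECT COUNT(*) FROM notifications WHERE user_id = ? AND seen = 0');
        $stmt->execute([$userId]);
        $count = (int) $stmt->fetchColumn();
        $memcached->set($key, $count, 60);
    }
    return (int) $count;
}
// Whenever a notification is written, update or drop the key:
// $memcached->delete("notif_count:$userId");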
Keep in mind that Memcached is not a persistent datastore... That is, if you reboot the server, Memcached comes back up completely empty. You still need a persistent datastore, so still use your database, files, and network. Also, Memcached is specifically designed to be a volatile storage, serving only the most accessed and most updated data quickly. If the data gets old, it could be erased to make room for newer data. After all, RAM is fast, but it's not nearly as cheap as disk space, so this is a good tradeoff.
Also, no key-value storage system is a relational database. There are reasons for relational databases. You do not want to write your own ACID guarantee wrapper around a key-value store. You do not want to enforce referential integrity on a key-value store. A fancy name for a key-value store is a NoSQL database. Keep that in mind: you might get horizontal scalability from the likes of Cassandra, and you might get blazing speed from the likes of Memcached, but you don't get SQL and all the many, many, many decades of evolution that RDBMSs have had.
And, finally:
Don't mix languages
If, after implementing a reverse proxy and an in-memory cache you still want to implement a WebSocket server, then more power to you. Just keep in mind the implications of which server you choose.
If you want to use Socket.io with Node.js, write your entire application in Javascript. Otherwise, choose a WebSocket server that is written in the same language as the rest of your system.
Example of a one-language solution:
<?php // ~/projects/MySocialNetwork/core/users/myuser.php
class MyUser {
    public function getNotificationCount() {
        // Note: Don't go to the DB first if we can help it.
        // 0 is false-ish, so explicitly check for "no result" rather than truthiness.
        // $this->memcachedWrapper is assumed to be injected elsewhere.
        if (($notifications = $this->memcachedWrapper->getNotificationCount($this->userId)) !== null)
            return $notifications;
        $userModel = new MyUserModel($this->userId);
        return $userModel->getNotificationCount();
    }
}
...
<?php // ~/projects/WebSocketServerForMySocialNetwork/eventhandlers.php
function websocketTickCallback() {
    global $connectedUsers; // assumed to be maintained by the socket server
    foreach ($connectedUsers as $user) {
        if ($user->getHasChangedNotifications()) {
            $notificationCount = $user->getNotificationCount();
            $contents = json_encode(array('Notification Count' => $notificationCount));
            $message = new WebsocketResponse($user, $contents);
            $message->send();
            $user->resetHasChangedNotifications();
        }
    }
}
If we were using socket.io, we would have to write our MyUser class twice, once in PHP and once in Javascript. Who wants to bet that the classes will implement the same logic in the same ways in both languages? What if two developers are working on the different implementations of the classes? What if a bugfix gets applied to the PHP code, but nobody remembers the Javascript?

How to see the bandwidth usage of my Flash application?

I'm developing an online sudoku game, with ActionScript 3.
I made the game and asked people to test it, it works, but the website goes down constantly. I'm using 000webhost, and I'm suspecting it is a bandwidth usage precaution.
My application updates the current puzzle by parsing a JSON string every 2 seconds. And of course when a player enters a number, it sends a GET request to update the MySQL database. Do you think this causes a lot of data traffic?
How can I see the bandwidth usage value?
And how should I decrease the data traffic between Flash and MySQL (or PHP, really)?
Thanks !
There isn't enough information for a straight answer, and if there were, it'd probably take more time to figure out. But here's some directions you could look into.
Bandwidth may or may not be an issue. There are many things that could happen, you may very well put too much strain on the HTTP server, run out of worker threads, have your MySQL tables lock up most of the time, etc.
What you're doing does indeed sound like it's putting a lot of strain on the server. Monitoring this client-side would be inefficient; you need to look at some server-side values, but you generally don't have access to those unless you have at least a VPS.
Transmitting data as JSON is easier to implement and debug, but a more efficient way to send data (binary instead of strings) is AMF: http://en.wikipedia.org/wiki/Action_Message_Format
One PHP implementation for the server side part is AMFPHP: http://www.silexlabs.org/amfphp/
Alternatively, your example is a good use case for the remote shared objects in Flash Media Server (or similar products). A remote shared object does exactly what you're doing with MySQL: it creates a common memory space where you can store whatever data you want, and it keeps that data synchronised with all the clients. Automagically. :)
You can start from here: http://livedocs.adobe.com/flashmediaserver/3.0/hpdocs/help.html?content=00000100.html
