What is the difference between revolutions and iterations in phpbench?

I already read the documentation, but when testing, I'm still not able to understand well the difference between them.
For example, with this simple file:
<?php

class StackOverflowBench
{
    public function benchNothing()
    {
    }
}
When I set 10,000 revolutions and only one iteration, here is my result:
| subject      | set | revs  | its | mem_peak   | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
| benchNothing | 0   | 10000 | 1   | 2,032,328b | 10.052μs | 10.052μs | 10.052μs | 10.052μs | 0.000μs | 0.00%  | 1.00x |
The best, mean, mode and worst are all the same, which means they are based on the only iteration I ran.
When I run it with 10 revolutions and still 1 iteration, I have this:
| subject      | set | revs | its | mem_peak   | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
| benchNothing | 0   | 10   | 1   | 2,032,328b | 10.200μs | 10.200μs | 10.200μs | 10.200μs | 0.000μs | 0.00%  | 1.00x |
This seems to mean the times reported are not the sum of all the revolutions, but something like an average per revolution within each iteration.
If I wanted to measure the best and worst execution time of each individual call to the method, I'd try 1000 iterations of only 1 revolution each, but that takes way too much time. I launched it with 100 iterations of 1 revolution; here is the result:
| subject      | set | revs | its | mem_peak   | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
| benchNothing | 0   | 1    | 100 | 2,032,328b | 20.000μs | 25.920μs | 25.196μs | 79.000μs | 5.567μs | 21.48% | 1.00x |
This time, the measured time is at least twice as long, and I'm wondering what I have misunderstood. I may be using this information badly (I know my last example is a poor one).
Is it necessary to measure the best and worst of each revolution, like I want to do?
What is the point of iterations?

Revolution vs iteration
Let's take your example class:
class StackOverflowBench
{
    public function benchNothing()
    {
    }
}
If you have 100 revolutions and 3 iterations, this is the pseudo code that will be run:
// Iterations
for ($i = 0; $i < 3; $i++) {
    // Reset memory stats code here...
    // Start timer for iteration...

    // Create instance
    $obj = new StackOverflowBench();

    // Revolutions
    for ($j = 0; $j < 100; $j++) {
        $obj->benchNothing();
    }

    // Stop timer...
    // Call `memory_get_usage` to get memory stats
}
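For reference, here is a minimal sketch of how those two numbers are usually attached to a subject. It assumes phpbench's @Revs and @Iterations docblock annotations; as far as I know, the same values can also be overridden on the command line with --revs and --iterations:

<?php

class StackOverflowBench
{
    /**
     * @Revs(100)
     * @Iterations(3)
     */
    public function benchNothing()
    {
        // With these settings, phpbench records 3 samples (iterations),
        // each consisting of 100 calls (revolutions) to this method,
        // exactly like the pseudo code above.
    }
}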
What does the report mean?
Almost all of the calculated stats (mem_peak, best, mean, mode, worst, stdev and rstdev) in the output are based on individual iterations and are documented here.
The diff stat is the odd one out; it is documented here and mentioned elsewhere as:
the percentage difference from the lowest measurement
When you run a test, you can specify which column to report the difference on. So if you set the diff_column to run time and iteration #1 takes 10 seconds while #2 takes 20 seconds, the diff for #1 would be 1.00 (since it is the lowest) and #2 would be 2.00, since it took twice as long. (Actually, I'm not 100% sure that is the exact usage of that column.)
Measuring revolution vs iteration
Some code needs to be run thousands or millions of times in a task/request/action/etc., which is why revolutions exist. If I run a simple but critical block of code just once, a report might tell me it takes 0.000 seconds, which isn't helpful. That's why some blocks of code need to have their revolution count kicked up to get a rough idea, based on possible real-world usage, of how they perform under load. Array-sorting algorithms are great examples of a tightly-coupled call that will happen a lot in a single request.
Other code might only do a single thing, such as making an API or database request, and for those blocks of code we need to know how much system resource they take up as a whole. So if I make a database call that consumes 2MB of memory, and I'm expecting to have 1,000 concurrent users, those calls could take up 2GB of memory. (I'm simplifying, but you should get the gist.)
If you look at my pseudo code above, you'll see that setting up each iteration is more expensive than each revolution. The revolution basically just invokes a method, but the iteration calculates memory and does instantiation-related work.
So, to your second-to-last question:
Is it necessary to measure the best and worst of each revolution, like I want to do?
Probably not, although there are tools out there that will tell you. You could, for instance, find out how much memory was used before and after a method call to determine whether your code is sub-optimal, but you can also do that with PHPBench by making a 1-iteration, 1-revolution run and looking for methods with high memory usage.
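As a rough illustration of that before/after idea outside of PHPBench, a minimal sketch using plain PHP's memory_get_usage() might look like this (the benchNothing() call is just a stand-in for whatever method you are inspecting):

<?php

$before = memory_get_usage(true);       // bytes allocated before the call

$obj = new StackOverflowBench();
$obj->benchNothing();                   // stand-in for the method under inspection

$after = memory_get_usage(true);        // bytes allocated after the call
$peak  = memory_get_peak_usage(true);   // high-water mark for the whole script

printf("delta: %d bytes, peak: %d bytes\n", $after - $before, $peak);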
I'd further say that if you have code with great variance per revolution, it is almost certainly related to IO factors rather than to the code itself, or it is related to the test dataset, most probably its size.
You should hopefully know all of your IO-related paths, so benchmarking the various problems related to those paths isn't really what this tool is for.
Dataset-related problems, however, are interesting, and are a case where you might want to know about each run. There, too, the measurements exist either to tell you how to fix or change your code, or to confirm that your code runs with a certain time complexity.

Related

Code Igniter file cache – more misses than expected

We're currently using Code Igniter's file cache to optimize a heavy DB query which is called often in our multi-user environment, but doesn't change often. The cache is set to expire every 60 seconds, so that would suggest the query would be called about 60 times an hour. In the DB monitor, we're seeing the query invoked 340 ± 150 times per hour, or almost six times as frequently as expected.
The code is nothing special, all in one function:
// Cache identifier
$cache = 'cache-identifier';

// Try to pull from the cache first
if ($working = $this->cache->file->get($cache)) {
    return $working;
}

// Cache didn't work out, query directly
$sql = '<<heavy-DB-query>>';
$working = $this->db->query($sql)->result_array();

// Save the cache (note essentially redundant check: $working is 99.95% non-null)
if ($working) {
    $this->cache->file->save($cache, $working, 60);
}

return $working;
We do see a significant reduction compared to not caching, but would like to optimize more. Any ideas why it would be missing so often?
Some more details:
Prior to caching, we were getting the DB query invoked about 3,000 calls/hour (≈ 50 calls/minute). So 340 calls/hour (≈ 5½ calls/minute) is a significant improvement. We tried outputting to a log file whenever there's a miss, but the act of doing that seems to change the dynamics and the statistics aren't the same.
Testing in isolation, single user, from browser to server and back, a miss takes about 1.0 s, and a hit about 0.6 s. The query itself is about 0.1 s, but with overhead, server to DB and back is about 0.5 s.
I think what might be going on is that side effects from other activities are causing the users to synchronize, so that we get the queries bunched up and not random. The first one has a miss, makes the query, same with second, third and so on, until the first gets the results back. Then the remainder for that minute get hits. Given the times involved, seeing 5½ calls a minute would fit this hypothesis more-or-less. But how to test?
Even more:
It turns out the server access logs had the answer: they were indeed showing groups of multiple queries with the same timestamp. The users often have multiple windows open in the same browser, and in this scenario the queries do tend to synchronize. I've added a random element to the timer to break up this synchronization. So far, it seems to be better. Now that I know what's going on, I'll be happy with ≈2 calls/minute!
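For illustration, the "random element" could be as simple as jittering the TTL passed to the cache save, along these lines (this sketch assumes the same $this->cache->file API shown above; the 0-15 second range is arbitrary):

// Base TTL of 60 seconds plus 0-15 seconds of random jitter, so that
// windows opened at the same moment don't all expire at the same time.
$ttl = 60 + mt_rand(0, 15);

if ($working) {
    $this->cache->file->save($cache, $working, $ttl);
}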

PHP get amount of MySQL queries per hour

Short:
Is there a way to get the number of queries that were executed within a certain timespan (via PHP) in an efficient way?
Full:
I'm currently running an API for a frontend web application that will be used by a large number of users.
I use my own custom framework that uses models to do all the data magic and they execute mostly INSERTs and SELECTs. One function of a model can execute 5 to 10 queries on a request and another function can maybe execute 50 or more per request.
Currently, I don't have a way to check if I'm "killing" my server by executing (for example) 500 queries every second.
I also don't want any surprises when the number of users increases to 200, 500, 1,000, ... within the first week and maybe 10,000 by the end of the month.
I want to pull some sort of statistics, per hour, so that I have an idea of the average and can maybe work on performance and efficiency before everything fails; merge some queries into one "bigger" one, or things like that.
Posts I've read suggested just keeping a counter within my code, but that would require more queries, just to have a number. The preferred way would be to add a SELECT to my hourly statistics script that returns the number of queries that were executed for the x processed requests.
To conclude:
Are there any other options to keep track of this amount?
Extra. Should I be worried and concerned about the number of queries? They are all small ones, built for fast execution without bottlenecks or heavy calculations, and I'm currently quite impressed by how blazingly fast everything is running!
Extra extra. It's on our own VPS server, so I have full access and I'm not limited to "basic" functions or commands or anything like that.
Short Answer: Use the slowlog.
Full Answer:
At the start and end of the time period, perform
SELECT VARIABLE_VALUE AS Questions
FROM information_schema.GLOBAL_STATUS
WHERE VARIABLE_NAME = 'Questions';
Then take the difference.
If the timing is not precise, also get ... WHERE VARIABLE_NAME = 'Uptime' in order to get the elapsed time (to the second).
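As a rough sketch of the "take the difference" idea from PHP (this assumes a PDO connection; note that on MySQL 5.7+ this table was moved to performance_schema.global_status, so adjust the schema name to your version):

<?php

// Hypothetical helper: read MySQL's global 'Questions' counter via PDO.
function getQuestionsCount(PDO $pdo)
{
    $sql = "SELECT VARIABLE_VALUE FROM information_schema.GLOBAL_STATUS
            WHERE VARIABLE_NAME = 'Questions'";

    return (int) $pdo->query($sql)->fetchColumn();
}

$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

// Record the counter now and again in the next hourly run (the sleep()
// here is only to keep the example self-contained).
$start = getQuestionsCount($pdo);
sleep(3600);
$end = getQuestionsCount($pdo);

echo ($end - $start) . " queries executed in the last hour\n";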
But there is a problem... 500 very fast queries may not be as problematic as 5 very slow and complex queries. I suggest that elapsed time might be a better metric for deciding whether to kill a query.
And... Killing the process may lead to a puzzling situation wherein the naughty statement remains in the "Killing" state for a long time. (See SHOW PROCESSLIST.) The reason this can happen is that the statement needs to be undone to preserve the integrity of the data. An example is a single UPDATE statement that modifies all rows of a million-row table.
If you do a KILL in such a situation, it is probably best to just let it finish; the undo has to run to completion anyway.
In a different direction, if you have, say, a one-row UPDATE that does not use an index but needs a table scan, then the query will take a long time and will possibly be more of a burden on the system than "500 queries". The 'cure' is likely to be adding an INDEX.
What to do about all this? Use the slowlog. Set long_query_time to some small value. The default is 10 (seconds); this is almost useless. Change it to 1 or even something smaller. Then keep an eye on the slowlog. I find it to be the best way to watch out for the system getting out of hand and to tell you what to work on fixing. More discussion: http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog
Note that the best metric in the slowlog is neither the number of times a query is run, nor how long it runs, but the product of the two. This is the default sort for pt-query-digest. For mysqldumpslow, adding -s t gets the results sorted in that order.

How to identify the bottlenecks with Xhprof?

I have an issue with a very slow API call and want to find out what it is caused by, using Xhprof: the default GUI and the callgraph. How should this data be analyzed?
What is the approach to finding the places in the code that should be optimized, and especially the most expensive bottlenecks?
Of all those columns, focus on the one called "IWall%", column 5.
Notice that send, doRequest, read, and fgets each have 72% inclusive wall-clock time.
What that means is if you took 100 stack samples, each of those routines would find itself on 72 of them, give or take, and I suspect they would appear together.
(Your graph should show that too.)
So since the whole thing takes 23 seconds, that means about 17 seconds are spent simply reading.
The only way you can reduce that 17 seconds is if you can find that some of the reading is unnecessary. Can you?
What about the remaining 28% (6 seconds)?
First, is it worth it?
Even if you could reduce that to zero (17 seconds total, which you can't), the speedup factor would be 1/(1-0.28) = 1.39, or 39%.
If you could reduce it by half (20 seconds total), it would be 1/(1-0.14) = 1.16, or 16%.
20 seconds versus 23, it's up to you to decide if it's worth the trouble.
If you decide it is, I recommend the random pausing method, because it doesn't flood you with noise.
It gets right to the heart of the matter, not only telling you which routines, but which lines of code, and why they are being executed.
(The why is most important, because you can't replace it if it's absolutely necessary.
With profilers, you tend to assume it is necessary, because you have no way to tell otherwise.)
Since you are looking for something taking about 14% of the time, you're going to have to examine 2/0.14 = 14 samples, on average, to see it twice, and that will tell you what it is.
Keep in mind that about 14 * 0.72 = 10 of those samples will land in fgets (and all its callers), so you can either ignore those or use them to make sure all that I/O is really necessary.
(For example, is it just possible that you're reading things twice, for some obscure reason like it was easier to do that way? I've seen that.)
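For context, the numbers being analyzed here are typically collected with something along these lines (assuming the xhprof PECL extension; the output path is arbitrary):

<?php

// Start profiling, including CPU and memory metrics.
xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);

// ... run the slow API call here ...

// Stop profiling; returns per-function data such as call counts and
// inclusive wall time, which the GUI turns into the columns above.
$data = xhprof_disable();

// Persist the raw run so it can be loaded into the XHProf UI
// or inspected later.
file_put_contents('/tmp/slow-api-call.xhprof', serialize($data));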

Truncate last X digits of number without division by 10eX

I'm making a blocking algorithm, and I just realised that adding a timeout to such an algorithm is not so easy if it should remain precise.
Adding a timeout means that the blocking algorithm should abort after X ms, if not earlier. Now I seem to have two options:
Iterating time (has some error, but is fast)
Check the blocking condition
Increment time_elapsed by 1 (which means 1e-6 s, given the use of usleep)
Compare time_elapsed with the timeout (here is the problem I will talk about)
usleep(1)
Getting system time every iteration (slow, but precise)
I know how to do this, please do not post any answers about that.
Comparing timeout with time_elapsed
And here is what bothers me. The timeout will be in milliseconds (1e-3 s) while usleep sleeps for 1e-6 seconds, so my time_elapsed will be 1000 times more precise than the timeout. I want to truncate the last three digits of time_elapsed (an operation equal to floor($time_elapsed/1000)) without dividing it. The division algorithm is too slow.
Summary
I want to make my variable 1000 times smaller without dividing it by 1000; I just want to get rid of the data. In binary I'd use the bit-shift operator, but I have no idea how to apply that to the decimal system.
Code sample:
Sometimes, when people on SO cannot answer the theoretical question, they really hunger for the code. Here it is:
floor($time_elapsed/1000);
I want to replace this code with something much faster. Please note that though the question itself is full of timeouts, the question title is only about truncating that data. Other users may find the solution useful for other purposes than timing.
Maybe PHP's number_format will help. Though this does cause rounding; if that is unacceptable, then I don't think it's possible, because PHP is loosely typed and you can't define numbers with a particular level of precision.
Try this:
(int)($time_elapsed*0.001)
This should be a lot faster.
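If you want to compare the approaches yourself, a rough micro-benchmark sketch along these lines could help (it assumes PHP 7+ for intdiv(), which does the integer division without involving floats; the iteration count and sample value are arbitrary):

<?php

$iterations = 1000000;
$value      = 123456;   // pretend elapsed time in microseconds

// Option 1: floor() with division
$t0 = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $r = floor($value / 1000);
}
$t1 = microtime(true);

// Option 2: cast after multiplying by the inverse (the suggestion above)
for ($i = 0; $i < $iterations; $i++) {
    $r = (int)($value * 0.001);
}
$t2 = microtime(true);

// Option 3: intdiv(), pure integer division (PHP 7+)
for ($i = 0; $i < $iterations; $i++) {
    $r = intdiv($value, 1000);
}
$t3 = microtime(true);

printf("floor: %.4fs  cast: %.4fs  intdiv: %.4fs\n", $t1 - $t0, $t2 - $t1, $t3 - $t2);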

Why is count() worse than $count?

I was just reviewing the answers to different questions to learn more. I saw an answer which says that it is bad practice in PHP to write
for($i=0;$i<count($array);$i++)
It says that calling the count function in the loop reduces the speed of the code. The discussion in the comments on that question was not clear. I want to know why it is not good practice, and what the alternative way of doing this should be.
You should do this instead:
$count = count($array);
for($i=0;$i<$count;$i++)...
The reason for doing this is that if you put count($array) inside the for loop, the count function has to be called on every iteration, which slows things down.
However, if you put the count into a variable, it is a static number that won't have to be recalculated every time.
For every iteration, PHP is checking that part of the loop (the condition) to see if it should keep looping, and every time it checks it, it is calculating the length of the array.
An easy way to cache that value is...
for($i=0,$count=count($array);$i<$count;$i++) { ... }
It is probably not necessary in small loops, but could make a huge difference when iterating over thousands of items, depending on what function call is in the condition (and how it determines its return value).
This is also why you should use foreach() { ... } if you can, it uses an iterator on a copy of the array to loop over the set and you don't have to worry about caching the condition of the loop.
I heard of a database in a doctor's surgery that made exactly this mistake with a piece of software. It was tested with about 100 records, all worked fine. Within a few months, it was dealing with millions of records and was totally unusable, taking minutes to load results. The code was replaced as per the answers above, and it worked perfectly.
To think about it another way, a fairly powerful dedicated server that's not doing much else will take about 1 nanosecond to do count($array). If you had 100 for loops, each counting 1,000 rows then that's only 0.0001 of a second.
However, that's 100,000 calculations for EVERY page load. Scale that up to 1,000,000 users (and who doesn't want to have 1 million users?) each doing 10 page loads, and now you have 1,000,000,000,000 (1 trillion) calculations. That's going to put a lot of load on the server: about 1,000 seconds (nearly 17 minutes) that your processor spends running that code.
Now increase the time it takes the machine to process the code, the number of items in the arrays, and the number of for loops in the code... you're talking of literally many trillions of processes and many hours of processing time that can be avoided by just storing the result in a variable first.
It is not good practice because as written, count($array) will be called each time through the loop. Assuming you won't be changing the size of the array within the loop (which itself would be a horrible idea), this function will always return the same value, and calling it repeatedly is redundant.
For short loops, the difference probably won't be noticeable, but it's still best to call the function once and use the computed value in the loop.
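If you want to see the difference for yourself, a rough micro-benchmark sketch (plain microtime() timing; the array size is arbitrary) might look like this:

<?php

$array = range(1, 100000);

// count() evaluated in the condition on every iteration
$t0 = microtime(true);
for ($i = 0; $i < count($array); $i++) {
    // empty body; we only care about the cost of the condition
}
$t1 = microtime(true);

// count() evaluated once and cached in $count
$count = count($array);
for ($i = 0; $i < $count; $i++) {
    // empty body
}
$t2 = microtime(true);

printf("count() in condition: %.4fs\ncached \$count:        %.4fs\n", $t1 - $t0, $t2 - $t1);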
