PHP (poor man's profiler) incorrect hrtime() difference - php

I'm implementing so-called poor man's profiling in PHP, in the straightforward way you would expect:
$start = hrtime(true);
// API request is more than 1 second long
$response = $this->getClient($cfg)->request($method, $url, ['query' => $query]);
// Exec time calculated in milliseconds
$end = round((hrtime(true) - $start) / 1E+6, 3);
// Result: $end = 2.642
The calculated execution time is always a very small value, just a few ms, which can't be true because the endpoint has a strict timeout of 1 s. hrtime() before and after the API request reports a very minor difference even though the API request itself is not that fast. microtime() gives similar results. cURL, in turn, returns a valid response time.
I can't understand what I'm doing wrong. Interestingly, the script's total execution time seems to be valid, but profiling points like these are strangely small. What is my problem?
I'm using
symfony/http-client: ^5.4
Docker php:7.4-fpm image with nginx
x64

The reason for this behavior, as mentioned by @AlexHowansky in the comments, is indeed the HTTP client I'm using: symfony/http-client. I completely missed that point from the very beginning.
As docs state:
Responses are always asynchronous, so that the call to the method returns immediately instead of waiting to receive the response.
... unassigned responses will fallback to synchronous requests.
I switched the API client to bare cURL requests, which took a few more lines but wasn't too tricky to do.
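Alternatively, the measurement can be made correct while keeping symfony/http-client by forcing the response to complete inside the measured span. A minimal sketch, assuming the same client setup as above (consuming the body blocks until the response has actually arrived):
$start = hrtime(true);
$response = $this->getClient($cfg)->request($method, $url, ['query' => $query]);
// getContent() blocks until the full response is received, so the
// measurement now covers the whole round trip, not just the dispatch.
$body = $response->getContent();
$end = round((hrtime(true) - $start) / 1E+6, 3);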

Related

Is memcached supposed to take this long?

I've compared these two pieces of code:
Test 1:
$time = microtime(true);
$memcached = new Memcached();
$memcached->addServer('localhost', 11211);
for ($i = 1; $i <= 1000; $i++) {
    $result = $memcached->get('test');
}
echo (microtime(true) - $time)*1000;
Resulting time: 50.509929656982
Test 2:
$time = microtime(true);
$memcached = new Memcached();
$memcached->addServer('localhost', 11211);
for ($i = 1; $i <= 1000; $i++) {
    $result = 'just me';
}
echo (microtime(true) - $time)*1000;
Resulting time: 0.3209114074707
Is memcached supposed to take this long?
You did rather a lot of confusing math there, so I don't think your question is entirely clear.
It looks like you got 50.509929656982 ms after multiplying the original figure (in seconds) by 1,000. Converted to microseconds, that's ~50,510 µs. Now, that was for 1,000 requests, so the average request returned in ~50 µs.
Now, if you actually want to grab 1,000 items, you'll want to do it in a single multiget request, which will reduce that per-item average considerably (see the sketch below).
If you're asking whether the time to form and transmit a network request, get the network response, parse it, and return it is expected to be slower than a no-op assignment in a tight loop, then yeah.
If you're asking if 50µs is fast or slow... that's really up to you.
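For illustration, a hedged sketch of that multiget approach (the key names are hypothetical; the point is one round trip instead of 1,000):
$memcached = new Memcached();
$memcached->addServer('localhost', 11211);

// Build the full list of keys up front.
$keys = [];
for ($i = 1; $i <= 1000; $i++) {
    $keys[] = "test:$i";
}

$time = microtime(true);
$results = $memcached->getMulti($keys); // single request for all keys
echo (microtime(true) - $time) * 1000;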
On a typical memcached setup I would expect almost every operation to finish in 1ms or less. I would recommend checking to see what your network latency is and then profiling the client to see how much time the client spends actually processing the operations. If you can rule out those being slow then you might have some kind of server side issue.

Memcached bug in PHP - binary protocol

I came across a bug using Memcached in PHP. Here's my piece of code:
<?php
$mc = new \Memcached();
$mc->setOption(\Memcached::OPT_BINARY_PROTOCOL, true);
$mc->addServer("127.0.0.1", 11211);
$mc->touch("key", time() + 600);
$touchResult = $mc->getResultCode();
$mc->set("key", 1, time() + 600);
$setResult = $mc->getResultCode();
echo "<pre>";
echo "Touch result: $touchResult\n";
echo "Set result: $setResult\n";
echo "</pre>";
When you run this for the first time, this is the output:
Touch result: 16
Set result: 0
And for the second time forth:
Touch result: 0
Set result: 5
Correct me if I'm wrong but this is a bug right? Does anyone know a workaround for this?
Here are the versions I use:
Ubuntu 12.04 64bit
PHP 5.3.14
memcached 2.1.0 (PECL module)
libmemcached 1.0.8
Memcached sever 1.4.13
PS. If you wonder what the result codes mean, here they are:
0 RES_SUCCESS
5 RES_WRITE_FAILURE
16 RES_NOTFOUND
[UPDATE]
I played a little more with the code and found something even more interesting. This bug happens regardless of the key that touch and set are working on. As long as the touch operation returns 0 (which means it was successful) the set operation will fail.
[UPDATE]
I managed to produce some other errors as well, e.g. getting one key from the server and then adding another also leads to nasty problems (a RES_END code). I believe all these problems are somehow related to the binary protocol; its implementation seems far from stable. Operations work just fine without the binary protocol, but once the protocol is set to binary, they run into blocking problems.
All right.
The first time, you touch a key that doesn't exist, so the result is RES_NOTFOUND. When you then set, the value is written successfully: RES_SUCCESS.
The next time, you touch the now-existing key (you set it on the first run) and get RES_SUCCESS for the operation; then you try to set a value for the existing key and it fails. All correct.
If you want to change an existing value, you must use the Memcached::replace() method instead of set().
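A minimal sketch of that workaround, assuming the same setup as the question (replace() reports RES_NOTSTORED when the key does not exist yet):
$mc = new \Memcached();
$mc->setOption(\Memcached::OPT_BINARY_PROTOCOL, true);
$mc->addServer('127.0.0.1', 11211);

// Try to update an existing key first; fall back to set() for new keys.
if (!$mc->replace('key', 1, time() + 600)
    && $mc->getResultCode() === \Memcached::RES_NOTSTORED) {
    $mc->set('key', 1, time() + 600);
}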

How can I get the real CPU time used in PHP?

How can I calculate the CPU time actually used by my php script?
Note this is NOT what I'm looking for:
<?php
$begin_time=microtime(true);
//..
//end of the script:
$total_time=microtime(true)-$begin_time;
because that would give me the time elapsed. That may include a lot of time used by unrelated processes running at the same time, as well as time spent waiting for i/o.
I've seen there is getrusage(), but a user comment in the documentation page says:
getrusage() reports kernel counters that are updated only once application loses context and a switch to kernel space happens. For example on modern Linux server kernels that would mean that getrusage() calls would return information rounded at 10ms, desktop kernels - at 1ms.
getrusage() isn't usable for micro-measurements at all - and getmicrotime(true) might be much more valuable resource.
so that doesn't seem to be an option, is it?
What alternatives do I have?
Define "your" I/O. Technically a move of memory to cpu registers is I/O. Are you saying you want to remove that time from your calculation? (I'm guessing no)
What I'm getting at is you are looking to profile your code in some way most likely. If you want to measure time not spent reading/writing to files or network sockets, just use microtime and put extra calls around portions doing I/O. Then you will also get an idea of how much time your I/O is taking. More likely you will find you have some loop taking more time than you expect.
When I profile like this I either use profiling tools in eclipse, or I use a time logger and do some kind of binary search-ish insertion of the time logging into the code. Usually I find some small area of code that is taking 85% of the measured time and do optimization there.
Also as a side note, don't let the perfect become the enemy of the practical. 90% of the time during your process your calls won't be interrupted by some other process and your microtime counts will be close enough.
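For example, a rough sketch of that bracketing approach (fetchData() and crunch() are hypothetical stand-ins for the I/O-bound and CPU-bound portions):
$t0 = microtime(true);
$data = fetchData();                      // I/O-bound portion
$ioMs = (microtime(true) - $t0) * 1000;

$t1 = microtime(true);
$result = crunch($data);                  // CPU-bound portion
$cpuMs = (microtime(true) - $t1) * 1000;

printf("I/O: %.1f ms, processing: %.1f ms\n", $ioMs, $cpuMs);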
You can use getrusage()
Full example:
<?php
function rutime($ru, $rus, $index) {
    return ($ru["ru_$index.tv_sec"] * 1000 + intval($ru["ru_$index.tv_usec"] / 1000))
         - ($rus["ru_$index.tv_sec"] * 1000 + intval($rus["ru_$index.tv_usec"] / 1000));
}
$cpu_before = getrusage();
$ms = microtime(true) * 1000;
sleep(3);
$tab = [];
for ($i = 0; $i < 500000; $i++) {
    $tab[] = $i;
}
$cpu_after = getrusage();
echo "Took ".rutime($cpu_after, $cpu_before, "utime")." ms CPU usage" . PHP_EOL;
echo "Took ".((microtime(true) * 1000) - $ms)." ms total". PHP_EOL;
Source: https://helpdesk.nodehost.ca/en/article/how-to-calculate-real-cpu-usage-in-a-php-script-o3ceu8/

Gathering entropy in web apps to create (more) secure random numbers

After several days of research and discussion I came up with this method to gather entropy from visitors (you can see the history of my research here).
When a user visits, I run this code:
$entropy=sha1(microtime().$pepper.$_SERVER['REMOTE_ADDR'].$_SERVER['REMOTE_PORT'].
$_SERVER['HTTP_USER_AGENT'].serialize($_POST).serialize($_GET).serialize($_COOKIE));
Note: $pepper is a per-site/setup random string set by hand.
Then I execute the following (My)SQL query:
$query="update `crypto` set `value`=sha1(concat(`value`, '$entropy')) where name='entropy'";
That means we combine the entropy of the visitor's request with the entropy already gathered from the others.
That's all.
Then, when we want to generate random numbers, we combine the gathered entropy with the output:
$query="select `value` from `crypto` where `name`='entropy'";
//...
extract(unpack('Nrandom', pack('H*', sha1(mt_rand(0, 0x7FFFFFFF).$entropy.microtime()))));
Note: the last line is part of a modified version of the crypt_random function from phpseclib.
Please tell me your opinion about the scheme, and share other ideas/info regarding entropy gathering and random number generation.
PS: I know about randomness sources like /dev/urandom.
This system is just an auxiliary system, or (when we don't have (access to) those sources) a fallback scheme.
In the best scenario, your biggest danger is a local user disclosure of information exploit. In the worst scenario, the whole world can predict your data. Any user that has access to the same resources you do: the same log files, the same network devices, the same border gateway, or the same line that runs between you and your remote connections allows them to sniff your traffic by unwinding your random number generator.
How would they do it? Why, basic application of information theory and a bit of knowledge of cryptography, of course!
You don't have a wrong idea, though! Seeding your PRNG with real sources of randomness is generally quite useful to prevent the above attacks from happening. For example, this same level of attack can be exploited by someone that understands how /dev/random gets populated on a per-system basis if the system has low entropy or its sources of randomness are reproducible.
If you can sufficiently secure the processes that seed your pool of entropy (for example, by gathering data from multiple sources over secure lines), the likelihood that someone is able to listen in becomes smaller and smaller as you get closer and closer to the desirable cryptographic qualities of a one-time pad.
In other words, don't do this in PHP, using a single source of randomness fed into a single Mersenne twister. Do it properly, by reading from your best, system-specific alternative to /dev/random, seeding its entropy pool from as many secure, distinct sources of "true" randomness as possible. I understand you've stated that these sources of randomness are inaccessible, but this notion is strange when similar functions are afforded to all major operating systems. So, I suppose I find the concept of an "auxiliary system" in this context to be dubious.
This will still be vulnerable to an attack by a local user cognizant of your sources of entropy, but securing the machine and increasing the true entropy within /dev/random will make it far more difficult for them to do their dirty work short of a man-in-the-middle attack.
As for cases where /dev/random is indeed accessible, you can seed it fairly easily:
Look at what options exist on your system for using /dev/hw_random
Embrace rngd (or a good alternative) for defining your sources of randomness
Use rng-tools for inspecting and improving your randomness profile
And finally, if you need a good, strong source of randomness, consider investing in more specialized hardware.
Best of luck in securing your application.
PS: You may want to give questions like this a spin at Security.SE and Cryptography.SE in the future!
Use Random.Org
If you need truly random numbers, use random.org. These numbers are generated from atmospheric noise. Besides a library for PHP, it also has an HTTP interface that lets you get truly random numbers with simple requests:
https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new
This means you can simply retrieve real random numbers in PHP without any additional PECL extension on the server.
If you don't want other users to be able to "steal" your random numbers (as MrGomez argues), just use HTTPS with certificate checking. Here follows an example with HTTPS certificate checking:
$url = "https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
$response = curl_exec($ch);
if ($response === FALSE) {
    echo "http request failed: " . curl_error($ch);
} else {
    echo $response;
}
curl_close($ch);
If you need more information on how to create https requests:
Make a HTTPS request through PHP and get response
http://unitstep.net/blog/2009/05/05/using-curl-in-php-to-access-https-ssltls-protected-sites/
More on security
Again, some might argue that if an attacker queries random.org at the same time as you, he might get the same numbers and predict yours. I don't know whether random.org even works that way, but if you are really concerned, you can lessen the chance by fooling the attacker with dummy requests whose results you throw away, or by using only a certain part of the random numbers you get.
As MrGomez notes in his comment, this shall not be considered as an ultimate solution to security, but only as one of possible sources of entropy.
Performance
Of course, if you need blitz latency, then doing one random.org request per client request might not be the best idea... but what about doing one bigger request to pre-cache the random numbers, say, every 5 minutes?
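A hedged sketch of that pre-caching idea (the cache file path, batch size and TTL are arbitrary illustrative choices):
function getPrecachedNumbers($cacheFile = '/tmp/randomorg.cache', $ttl = 300)
{
    // Serve from the cache while it is fresh.
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return array_map('intval', file($cacheFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
    }

    // Otherwise fetch a bigger batch in one request and cache it.
    $url = 'https://www.random.org/integers/?num=1000&min=1&max=1000000'
         . '&col=1&base=10&format=plain&rnd=new';
    $response = file_get_contents($url);
    if ($response === FALSE) {
        return array(); // caller must fall back to another entropy source
    }
    file_put_contents($cacheFile, $response);
    return array_map('intval', explode("\n", trim($response)));
}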
To come to the point: as far as I know there is no way to generate entropy inside a PHP script, sorry for this non-answer. Even if you look at well-established scripts like phpass, you will see that their fallback system cannot do any magic.
The question is whether you should try it anyway or not. Since you want to publish your system under the GPL, you probably don't know in what scenario it will be used. In my opinion it's best then to require a random source, or to fail fast (die with an appropriate error message), so a developer who wants to use your system knows immediately that there is a problem.
To read from the random source, you could call the mcrypt_create_iv() function...
$randomBinaryString = mcrypt_create_iv($length, MCRYPT_DEV_URANDOM);
...this function reads from the random pool of the operating system. Since PHP 5.3 it does so on Windows servers as well, so you can leave it to PHP to handle the random source.
If you have access to /dev/urandom you can use this:
function getRandData($length = 1024) {
    $randf = fopen('/dev/urandom', 'r');
    $data = fread($randf, $length);
    fclose($randf);
    return $data;
}
UPDATE:
Of course you should have some backup in case opening the device fails; see the sketch below.
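A minimal sketch of such a fallback (using openssl_random_pseudo_bytes as the backup is my illustrative choice here, not the answerer's):
function getRandDataSafe($length = 1024) {
    $randf = @fopen('/dev/urandom', 'r');
    if ($randf === false) {
        // Backup path when the device can't be opened.
        return openssl_random_pseudo_bytes($length);
    }
    $data = fread($randf, $length);
    fclose($randf);
    return $data;
}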
Should you have access to the client side, you can enable mouse-movement tracking; this is what TrueCrypt uses for an extra level of entropy.
As I have said before, my rand function is a modified version of phpseclib's crypt_random function.
You can see it at the link given in my first post; at least the author of the phpseclib cryptographic library confirmed it. Not enough for ordinary apps? I don't speak of extreme/theoretical security, just of practical security to the extent really needed, and at the same time 'easily'/'cheaply enough' available, for almost all ordinary applications on the web.
phpseclib's crypt_random effectively and silently falls back to mt_rand (which, you should know, is really weak) in the worst case (no openssl_random_pseudo_bytes or urandom available), but my function uses a much more secure scheme in such cases. It just falls back to a scheme whose output is much harder to brute-force/predict and which should in practice be sufficient for all ordinary apps/sites. It uses possible (in practice very likely and hard to predict/circumvent) extra entropy that is gathered over time and quickly becomes almost impossible for outsiders to know. It adds this possible entropy to mt_rand's output (and also to the output of the other sources: urandom, openssl_random_pseudo_bytes, mcrypt_create_iv). If you are informed, you should know that this entropy can be added but not subtracted. In the (almost surely really rare) worst case, that extra entropy would be 0 or some tiny amount; in the mediocre case, which I think covers almost all cases, it would be even more than practically necessary. (I have had extensive cryptography studies, so when I say "I think", it is based on a much more informed and scientific analysis than most ordinary programmers could make.)
see the full code of my modified crypt_random:
function crypt_random($min = 0, $max = 0x7FFFFFFF)
{
    if ($min == $max) {
        return $min;
    }

    global $entropy;

    if (function_exists('openssl_random_pseudo_bytes')) {
        // openssl_random_pseudo_bytes() is slow on windows per the following:
        // http://stackoverflow.com/questions/1940168/openssl-random-pseudo-bytes-is-slow-php
        if ((PHP_OS & "\xDF\xDF\xDF") !== 'WIN') { // PHP_OS & "\xDF\xDF\xDF" == strtoupper(substr(PHP_OS, 0, 3)), but a lot faster
            extract(unpack('Nrandom', pack('H*', sha1(openssl_random_pseudo_bytes(4).$entropy.microtime()))));
            return abs($random) % ($max - $min) + $min;
        }
    }

    // see http://en.wikipedia.org/wiki//dev/random
    static $urandom = true;
    if ($urandom === true) {
        // Warnings will be output unless the error suppression operator is used. Errors such as
        // "open_basedir restriction in effect", "Permission denied", "No such file or directory", etc.
        $urandom = @fopen('/dev/urandom', 'rb');
    }
    if (!is_bool($urandom)) {
        extract(unpack('Nrandom', pack('H*', sha1(fread($urandom, 4).$entropy.microtime()))));
        // say $min = 0 and $max = 3. if we didn't do abs() then we could have stuff like this:
        // -4 % 3 + 0 = -1, even though -1 < $min
        return abs($random) % ($max - $min) + $min;
    }

    if (function_exists('mcrypt_create_iv') and version_compare(PHP_VERSION, '5.3.0', '>=')) {
        $tmp16 = @mcrypt_create_iv(4, MCRYPT_DEV_URANDOM);
        if ($tmp16 !== false) {
            extract(unpack('Nrandom', pack('H*', sha1($tmp16.$entropy.microtime()))));
            return abs($random) % ($max - $min) + $min;
        }
    }

    /* Prior to PHP 4.2.0, mt_srand() had to be called before mt_rand() could be called.
       Prior to PHP 5.2.6, mt_rand()'s automatic seeding was subpar, as elaborated here:
       http://www.suspekt.org/2008/08/17/mt_srand-and-not-so-random-numbers/
       The seeding routine is pretty much ripped from PHP's own internal GENERATE_SEED() macro:
       http://svn.php.net/viewvc/php/php-src/tags/php_5_3_2/ext/standard/php_rand.h?view=markup */
    static $seeded;
    if (!isset($seeded) and version_compare(PHP_VERSION, '5.2.5', '<=')) {
        $seeded = true;
        mt_srand(fmod(time() * getmypid(), 0x7FFFFFFF) ^ fmod(1000000 * lcg_value(), 0x7FFFFFFF));
    }
    extract(unpack('Nrandom', pack('H*', sha1(mt_rand(0, 0x7FFFFFFF).$entropy.microtime()))));
    return abs($random) % ($max - $min) + $min;
}
$entropy contains my extra entropy, which comes from the combined entropy of all requests' parameters so far + the current request's parameters + the entropy of a random string (*) set by hand at installation time.
*: length 22, composed of lower- and uppercase letters plus numbers (more than 128 bits of entropy)
Update 2: Code review warning to everyone: don't use the code in the original question. It's a security liability. If this code is online anywhere, remove it, as it opens the whole system, network and database to a malevolent user. You're not only exposing your code but all of your users' data.
Do not ever serialize user inputs. If your code is already doing it, stop your server and change your code. This is a great example of why not to do crypto by yourself.
Update 1: For real security you need unguessable randomness in your entropy. A suitable option to add entropy, as your question refers to, is to use the delta of your script's execution time, not microtime() by itself. The delta depends on the load of your server, and so is a combination of the hardware environment, temperature, network load, power load, disk access, CPU usage and voltage fluctuation, which together are unpredictable.
Using time(), a timestamp or microtime() alone is a flaw in your implementation.
Example code for the script-execution delta is coming; a sketch follows below.
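A hedged sketch of that timing-delta idea (the workload and the mixing step are illustrative assumptions, not the answerer's exact code):
// Time a small, load-sensitive workload; the delta jitters with server
// load, disk access, CPU contention, etc., which is the entropy source.
$t0 = microtime(true);
$acc = '';
for ($i = 0; $i < 10000; $i++) {
    $acc = sha1($acc . $i);
}
$delta = microtime(true) - $t0;

// Mix the delta with other per-request values before feeding the pool.
$entropyInput = sha1($delta . getmypid() . uniqid('', true));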
@martinstoeckli correctly stated that suitable random generation for crypto comes from
mcrypt_create_iv($lengthinbytes, MCRYPT_DEV_URANDOM);
but that falls outside the requirement of not having a crypto module.
In SQL, use RAND() in conjunction with your generated number:
http://www.tutorialspoint.com/mysql/mysql-rand-function.htm
PHP offers the rand() function as well:
http://php.net/manual/en/function.rand.php
They won't give you the same number, so you could use both.
mt_rand() should be used, not rand().

Tracking Memory Usage in PHP

I'm trying to track the memory usage of a script that processes URLs. The basic idea is to check that there's a reasonable buffer before adding another URL to a cURL multi handler. I'm using a 'rolling cURL' concept that processes a URLs data as the multi handler is running. This means I can keep N connections active by adding a new URL from a pool each time an existing URL processes and is removed.
I've used memory_get_usage() with some positive results. Adding the real_usage flag helped (I'm not really clear on the difference between 'system' memory and 'emalloc' memory, but system shows larger numbers). memory_get_usage() does ramp up as URLs are added, then down as the URL set is depleted. However, I just exceeded the 32M limit, with my last memory check being ~18M.
I poll the memory usage each time cURL multi signals a request has returned. Since multiple requests may return at the same time, there's a chance a bunch of URLs returned data at the same time and actually jumped the memory usage by that 14M. However, if memory_get_usage() is accurate, I guess that's what's happening.
[Update: I should have run more tests before asking, I guess. I increased PHP's memory limit (but left the 'safe' amount the same in the script) and the reported memory usage did jump from below my self-imposed limit of 25M to over 32M. Then, as expected, it slowly ramped down as URLs were not added. But I'll leave the question up: is this the right way to do this?]
Can I trust memory_get_usage() in this way? Are there better alternative methods for getting memory usage (I've seen some scripts parse the output of shell commands)?
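For context, a hedged sketch of the kind of buffer check described above ($mh, $urlPool and the 25M cap are hypothetical names/values):
$memoryCap = 25 * 1024 * 1024; // self-imposed soft limit, below memory_limit

// Only enqueue another URL while there is a reasonable buffer left.
while (!empty($urlPool) && memory_get_usage(true) < $memoryCap) {
    $ch = curl_init(array_pop($urlPool));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch); // $mh is the existing multi handler
}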
real_usage works this way:
Zend's memory manager does not use system malloc for every block it needs. Instead, it allocates a big block of system memory (in increments of 256K, can be changed by setting environment variable ZEND_MM_SEG_SIZE) and manages it internally. So, there are two kinds of memory usage:
How much memory the engine took from the OS ("real usage")
How much of this memory was actually used by the application ("internal usage")
Either one of these can be returned by memory_get_usage(). Which one is more useful for you depends on what you are looking into. If you're looking into optimizing your code in specific parts, "internal" might be more useful for you. If you're tracking memory usage globally, "real" would be of more use. memory_limit limits the "real" number, so as soon as all blocks that are permitted by the limit are taken from the system and the memory manager can't allocate a requested block, there the allocation fails. Note that "internal" usage in this case might be less than the limit, but the allocation still could fail because of fragmentation.
Also, if you are using some external memory tracking tool, you can set this
environment variable USE_ZEND_ALLOC=0 which would disable the above mechanism and make the engine always use malloc(). This would have much worse performance but allows you to use malloc-tracking tools.
See also an article about this memory manager; it has some code examples too.
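A small sketch contrasting the two numbers (the ~5 MB allocation is an arbitrary example):
$internalBefore = memory_get_usage(false); // "internal" (emalloc) usage
$realBefore     = memory_get_usage(true);  // blocks taken from the OS

$buffer = str_repeat('x', 5 * 1024 * 1024); // allocate roughly 5 MB

printf("internal: +%d bytes\n", memory_get_usage(false) - $internalBefore);
printf("real:     +%d bytes\n", memory_get_usage(true) - $realBefore);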
I also assume memory_get_usage() is safe, but I guess you can compare both methods and decide for yourself. Here is a function that parses the output of system commands:
function Memory_Usage($decimals = 2)
{
    $result = 0;
    if (function_exists('memory_get_usage')) {
        $result = memory_get_usage() / 1024;
    } else {
        if (function_exists('exec')) {
            $output = array();
            if (substr(strtoupper(PHP_OS), 0, 3) == 'WIN') {
                exec('tasklist /FI "PID eq ' . getmypid() . '" /FO LIST', $output);
                $result = preg_replace('/[\D]/', '', $output[5]);
            } else {
                exec('ps -eo%mem,rss,pid | grep ' . getmypid(), $output);
                $output = explode(' ', $output[0]);
                $result = $output[1];
            }
        }
    }
    return number_format(intval($result) / 1024, $decimals, '.', '');
}
Use Xdebug, as it was recently (January 29th) updated to include memory profiling information. It keeps track of the function calls and how much memory they consume. This gives you a very insightful view into your code and, at the very least, points you toward the problems.
The documentation is helpful, but essentially you install it, enable profiling with xdebug.profiler_enable = 1, set the output directory with xdebug.profiler_output_dir=/some/path, and feed the output to a tool such as qcachegrind to do the heavy lifting and let you see it visually.
Well, I have never really had a memory problem with my PHP scripts, so I don't think I can be of much help finding the cause of the problem, but what I can recommend is that you get a PHP accelerator: you will notice a serious performance increase, and memory usage will decline. Here is a list of accelerators and an article comparing a few of them (roughly 3x better performance with any of them):
Wikipedia List
Benchmark
The benchmarks are 2 years old but you get the idea of the performance increases.
If you have to, you can also increase your memory limit in PHP if you are still having problems even with the accelerator. Open up your php.ini and find:
memory_limit = 32M
and just increase it a little.
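Alternatively, a one-line sketch for raising the limit at runtime for a single script (the value is an arbitrary example):
ini_set('memory_limit', '64M'); // overrides php.ini for this request only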
