Converting indentation with preg_replace (no callback) - php

I have some XML chunk returned by DOMDocument::saveXML(). It's already pretty indented, with two spaces per level, like so:
<?xml version="1.0"?>
<root>
  <error>
    <a>eee</a>
    <b>sd</b>
  </error>
</root>
As far as I know, DOMDocument cannot be configured to use different indentation characters, so I thought I could run a regular expression over the output and replace each pair of leading spaces with a tab. This can be done with a callback function:
$xml_string = $doc->saveXML();
function callback($m)
{
    $spaces = strlen($m[0]);
    $tabs = $spaces / 2;
    return str_repeat("\t", $tabs);
}
$xml_string = preg_replace_callback('/^(?:[ ]{2})+/um', 'callback', $xml_string);
I'm now wondering if it's possible to do this w/o a callback function (and without the e-modifier (EVAL)). Any regex wizards with an idea?

You can use \G (note the two literal spaces in each alternative):
preg_replace('/^  |\G  /m', "\t", $string);
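As a quick check of what the pattern does, here is a minimal sketch on a made-up sample (the $sample string and the tabify name are mine, not from the question). It writes the two spaces as [ ]{2} so the pattern stays readable even if whitespace gets collapsed when copying:

```php
<?php
// Hypothetical helper name; wraps the \G replacement shown above.
function tabify(string $s): string
{
    // ^[ ]{2}  : a pair of spaces at the start of a line (m modifier)
    // \G[ ]{2} : a pair of spaces directly after the previous match
    return preg_replace('/^[ ]{2}|\G[ ]{2}/m', "\t", $s);
}

// Made-up sample: two spaces per level, plus a double space inside an element.
$sample = "<root>\n  <error>\n    <a>e  e</a>\n  </error>\n</root>";

// Show tabs as a visible \t for display.
echo str_replace("\t", '\t', tabify($sample)), "\n";
```

Only the leading pairs are converted. The double space inside <a> is left alone because \G can only match where the previous match ended.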
I ran some benchmarks and got the following results on Win32 with PHP 5.2 and 5.4:
>php -v
PHP 5.2.17 (cli) (built: Jan 6 2011 17:28:41)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies
>php -n test.php
XML length: 21100
Iterations: 1000
callback: 2.3627231121063
\G: 1.4221360683441
while: 3.0971200466156
/e: 7.8781840801239
>php -v
PHP 5.4.0 (cli) (built: Feb 29 2012 19:06:50)
Copyright (c) 1997-2012 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2012 Zend Technologies
>php -n test.php
XML length: 21100
Iterations: 1000
callback: 1.3771259784698
\G: 1.4414191246033
while: 2.7389969825745
/e: 5.5516891479492
Surprisingly, the callback is faster than \G in PHP 5.4 (although that seems to depend on the data; \G is faster in some other cases).
For \G, the pattern /^  |\G  /m is used; it is a bit faster than /(?:^|\G)  /m.
/(?>^|\G)  /m is even slower than /(?:^|\G)  /m.
The /u, /S and /X switches didn't affect \G performance noticeably.
The while replace is fastest if the depth is low (up to about 4 indentation levels, 8 spaces, in my test), but it gets slower as the depth increases.
The following code was used:
<?php
$base_iter = 1000;
$xml_string = str_repeat(<<<_STR_
<?xml version="1.0"?>
<root>
  <error>
    <a> eee </a>
    <b> sd </b>
    <c>
      deep
        deeper still
          deepest !
    </c>
  </error>
</root>
_STR_
, 100);
//*** while ***
$re = '%# Match leading spaces following leading tabs.
    ^       # Anchor to start of line.
    (\t*)   # $1: Preserve any/all leading tabs.
    [ ]{2}  # Match two spaces.
    %mx';
function conv_indent_while($xml_string) {
    global $re;
    while(preg_match($re, $xml_string))
        $xml_string = preg_replace($re, "$1\t", $xml_string);
    return $xml_string;
}
//*** \G ****
function conv_indent_g($string){
    return preg_replace('/^  |\G  /m', "\t", $string);
}
//*** callback ***
function callback($m)
{
    $spaces = strlen($m[0]);
    $tabs = $spaces / 2;
    return str_repeat("\t", $tabs);
}
function conv_indent_callback($str){
    return preg_replace_callback('/^(?:[ ]{2})+/m', 'callback', $str);
}
//*** callback /e (the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0) ***
function conv_indent_e($str){
    return preg_replace('/^(?:  )+/me', 'str_repeat("\t", strlen("$0")/2)', $str);
}
//*** tests
function test2() {
global $base_iter;
global $xml_string;
$t = microtime(true);
for($i = 0; $i < $base_iter; ++$i){
$s = conv_indent_while($xml_string);
if(strlen($s) >= strlen($xml_string))
exit("strlen invalid 2");
}
return (microtime(true) - $t);
}
function test1() {
global $base_iter;
global $xml_string;
$t = microtime(true);
for($i = 0; $i < $base_iter; ++$i){
$s = conv_indent_g($xml_string);
if(strlen($s) >= strlen($xml_string))
exit("strlen invalid 1");
}
return (microtime(true) - $t);
}
function test0(){
global $base_iter;
global $xml_string;
$t = microtime(true);
for($i = 0; $i < $base_iter; ++$i){
$s = conv_indent_callback($xml_string);
if(strlen($s) >= strlen($xml_string))
exit("strlen invalid 0");
}
return (microtime(true) - $t);
}
function test3(){
global $base_iter;
global $xml_string;
$t = microtime(true);
for($i = 0; $i < $base_iter; ++$i){
$s = conv_indent_e($xml_string);
if(strlen($s) >= strlen($xml_string))
exit("strlen invalid 02");
}
return (microtime(true) - $t);
}
echo 'XML length: ' . strlen($xml_string) . "\n";
echo 'Iterations: ' . $base_iter . "\n";
echo 'callback: ' . test0() . "\n";
echo '\G: ' . test1() . "\n";
echo 'while: ' . test2() . "\n";
echo '/e: ' . test3() . "\n";
?>

The following simplistic solution first comes to mind:
$xml_string = str_replace('  ', "\t", $xml_string);
But I assume you would like to limit the replacement to leading whitespace only. In that case, your current solution looks pretty clean to me. That said, you can do it without a callback or the e modifier, but you need to apply the replacement repeatedly, until no leading pair of spaces remains, like so:
$re = '%# Match leading spaces following leading tabs.
    ^       # Anchor to start of line.
    (\t*)   # $1: Preserve any/all leading tabs.
    [ ]{2}  # Match two spaces.
    %umx';
while(preg_match($re, $xml_string))
    $xml_string = preg_replace($re, "$1\t", $xml_string);
Surprisingly, my testing shows this to be nearly twice as fast as the callback method. (I would have guessed the opposite.)
Note that Qtax has an elegant solution that works just fine (I gave it my +1). However, my benchmarks show it to be slower than the original callback method. I think this is because the expression /(?:^|\G)  /um does not allow the regex engine to take advantage of the "anchor at the beginning of the pattern" internal optimization. The RE engine is forced to test the pattern against each and every position in the target string. With patterns beginning with the ^ anchor, the RE engine only needs to check at the beginning of each line, which allows it to match much faster.
Excellent question! +1
Addendum/Correction:
I must apologize because the performance statements I made above are wrong. I ran the regexes against only one (non-representative) test file which had mostly tabs in the leading whitespace. When tested against a more realistic file having lots of leading spaces, my recursive method above performs significantly slower than the other two methods.
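One plausible reading of why the loop suffers on deeply indented input (my interpretation, not something measured above): each preg_replace pass converts only one pair of spaces per line, so the whole string is rescanned once per indentation level. A small sketch with a hypothetical count_passes helper makes that visible:

```php
<?php
// Hypothetical helper: runs the loop-based conversion and reports how many
// full preg_replace passes over the string were needed.
function count_passes(string $s): array
{
    $re = '/^(\t*)[ ]{2}/m';
    $passes = 0;
    while (preg_match($re, $s)) {
        // Each pass turns one leading pair of spaces per line into a tab.
        $s = preg_replace($re, "$1\t", $s);
        $passes++;
    }
    return [$s, $passes];
}

// Made-up input with a maximum depth of three levels (six spaces).
[$out, $passes] = count_passes("a\n  b\n      c\n");
echo $passes, "\n"; // 3
```

The pass count equals the deepest indentation level in the input, which fits the observation that the loop does well on shallow files and poorly on deep ones.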
If anyone is interested, here is the benchmark script I used to measure the performance of each regex:
test.php
<?php // test.php 20120308_1200
require_once('inc/benchmark.inc.php');
// -------------------------------------------------------
// Test 1: Recursive method. (ridgerunner)
function tabify_leading_spaces_1($xml_string) {
$re = '%# Match leading spaces following leading tabs.
^ # Anchor to start of line.
(\t*) # $1: Any/all leading tabs.
[ ]{2} # Match two spaces.
%umx';
while(preg_match($re, $xml_string))
$xml_string = preg_replace($re, "$1\t", $xml_string);
return $xml_string;
}
// -------------------------------------------------------
// Test 2: Original callback method. (hakre)
function tabify_leading_spaces_2($xml_string) {
return preg_replace_callback('/^(?:[ ]{2})+/um', '_callback', $xml_string);
}
function _callback($m) {
$spaces = strlen($m[0]);
$tabs = $spaces / 2;
return str_repeat("\t", $tabs);
}
// -------------------------------------------------------
// Test 3: Qtax's elegantly simple \G method. (Qtax)
function tabify_leading_spaces_3($xml_string) {
return preg_replace('/(?:^|\G)  /um', "\t", $xml_string);
}
// -------------------------------------------------------
// Verify we get the same results from all methods.
$data = file_get_contents('testdata.txt');
$data1 = tabify_leading_spaces_1($data);
$data2 = tabify_leading_spaces_2($data);
$data3 = tabify_leading_spaces_3($data);
if ($data1 == $data2 && $data2 == $data3) {
echo ("GOOD: Same results.\n");
} else {
exit("BAD: Different results.\n");
}
// Measure and print the function execution times.
$time1 = benchmark_12('tabify_leading_spaces_1', $data, 2, true);
$time2 = benchmark_12('tabify_leading_spaces_2', $data, 2, true);
$time3 = benchmark_12('tabify_leading_spaces_3', $data, 2, true);
?>
The above script uses the following handy little benchmarking function I wrote some time ago:
benchmark.inc.php
<?php // benchmark.inc.php
/*----------------------------------------------------------------------------
function benchmark_12($funcname, $p1, $reptime = 1.0, $verbose = true, $p2 = NULL) {}
By: Jeff Roberson
Created: 2010-03-17
Last edited: 2012-03-08
Discussion:
This function measures the time required to execute a given function by
calling it as many times as possible within an allowed period == $reptime.
A first pass determines a rough measurement of function execution time
by increasing the $nreps count by a factor of 10 - (i.e. 1, 10, 100, ...),
until an $nreps value is found which takes more than 0.01 secs to finish.
A second pass uses the value determined in the first pass to compute the
number of reps that can be performed within the allotted $reptime seconds.
The second pass then measures the time required to call the function the
computed number of times (which should take about $reptime seconds). The
average function execution time is then computed by dividing the total
measured elapsed time by the number of reps performed in that time, and
then all the pertinent values are returned to the caller in an array.
Note that this function is limited to measuring only those functions
having either one or two arguments that are passed by value and
not by reference. This is why the name of this function ends with "12".
Variations of this function can be easily cloned which can have more
than two parameters.
Parameters:
$funcname: String containing name of function to be measured. The
function to be measured must take one or two parameters.
$p1: First argument to be passed to $funcname function.
$reptime Target number of seconds allowed for benchmark test.
(float) (Default=1.0)
$verbose Boolean value determines if results are printed.
(bool) (Default=true)
$p2: Second (optional) argument to be passed to $funcname function.
Return value:
$result[] Array containing measured and computed values:
$result['funcname'] : $funcname - Name of function measured.
$result['msg'] : $msg - String with formatted results.
$result['nreps'] : $nreps - Number of function calls made.
$result['time_total'] : $time - Seconds to call function $nreps times.
$result['time_func'] : $t_func - Seconds to call function once.
$result['result'] : $result - Last value returned by function.
Variables:
$time: Float epoch time (secs since 1/1/1970) or benchmark elapsed secs.
$i: Integer loop counter.
$nreps Number of times function called in benchmark measurement loops.
----------------------------------------------------------------------------*/
function benchmark_12($funcname, $p1, $reptime = 1.0, $verbose = false, $p2 = NULL) {
if (!function_exists($funcname)) {
exit("\n[benchmark1] Error: function \"{$funcname}()\" does not exist.\n");
}
if (!isset($p2)) { // Case 1: function takes one parameter ($p1).
// Pass 1: Measure order of magnitude number of calls needed to exceed 10 milliseconds.
for ($time = 0.0, $n = 1; $time < 0.01; $n *= 10) { // Exponentially increase $nreps.
$time = microtime(true); // Mark start time. (sec since 1970).
for ($i = 0; $i < $n; ++$i) { // Loop $n times. ($n = 1, 10, 100...)
$result = ($funcname($p1)); // Call the function over and over...
}
$time = microtime(true) - $time; // Mark stop time. Compute elapsed secs.
$nreps = $n; // Number of reps just measured.
}
$t_func = $time / $nreps; // Function execution time in sec (rough).
// Pass 2: Measure time required to perform $nreps function calls (in about $reptime sec).
if ($t_func < $reptime) { // If pass 1 time was not pathetically slow...
$nreps = (int)($reptime / $t_func); // Figure $nreps calls to add up to $reptime.
$time = microtime(true); // Mark start time. (sec since 1970).
for ($i = 0; $i < $nreps; ++$i) { // Loop $nreps times (should take $reptime).
$result = ($funcname($p1)); // Call the function over and over...
}
$time = microtime(true) - $time; // Mark stop time. Compute elapsed secs.
$t_func = $time / $nreps; // Average function execution time in sec.
}
} else { // Case 2: function takes two parameters ($p1 and $p2).
// Pass 1: Measure order of magnitude number of calls needed to exceed 10 milliseconds.
for ($time = 0.0, $n = 1; $time < 0.01; $n *= 10) { // Exponentially increase $nreps.
$time = microtime(true); // Mark start time. (sec since 1970).
for ($i = 0; $i < $n; ++$i) { // Loop $n times. ($n = 1, 10, 100...)
$result = ($funcname($p1, $p2)); // Call the function over and over...
}
$time = microtime(true) - $time; // Mark stop time. Compute elapsed secs.
$nreps = $n; // Number of reps just measured.
}
$t_func = $time / $nreps; // Function execution time in sec (rough).
// Pass 2: Measure time required to perform $nreps function calls (in about $reptime sec).
if ($t_func < $reptime) { // If pass 1 time was not pathetically slow...
$nreps = (int)($reptime / $t_func); // Figure $nreps calls to add up to $reptime.
$time = microtime(true); // Mark start time. (sec since 1970).
for ($i = 0; $i < $nreps; ++$i) { // Loop $nreps times (should take $reptime).
$result = ($funcname($p1, $p2)); // Call the function over and over...
}
$time = microtime(true) - $time; // Mark stop time. Compute elapsed secs.
$t_func = $time / $nreps; // Average function execution time in sec.
}
}
$msg = sprintf("%s() Nreps:%7d Time:%7.3f s Function time: %.6f sec\n",
$funcname, $nreps, $time, $t_func);
if ($verbose) echo($msg);
return array('funcname' => $funcname, 'msg' => $msg, 'nreps' => $nreps,
'time_total' => $time, 'time_func' => $t_func, 'result' => $result);
}
?>
When I run test.php (which includes benchmark.inc.php), here are the results I get:
GOOD: Same results.
tabify_leading_spaces_1() Nreps: 1756 Time: 2.041 s Function time: 0.001162 sec
tabify_leading_spaces_2() Nreps: 1738 Time: 1.886 s Function time: 0.001085 sec
tabify_leading_spaces_3() Nreps: 2161 Time: 2.044 s Function time: 0.000946 sec
Bottom line: I would recommend using Qtax's method.
Thanks Qtax!

Related

PHP : non-preg_match version of: preg_match("/[^a-z0-9]/i", $a, $match)?

Suppose the string is:
$a = "abc-def";
$i = '';
if (preg_match("/[^a-z0-9]/i", $a, $m)) {
    $i = "I stopped scanning '$a' because I found a violation in it while scanning it from left to right. The violation was: $m[0]";
}
echo $i;
In the example above, "-" should be reported as the violation.
I would like to know if there is a non-preg_match way of doing this.
If there is, I will likely run benchmarks, perhaps 1,000 or 1 million runs, to see which is faster and more efficient.
In the benchmarks, $a will be much longer.
I want to make sure it does not scan the entire $a but stops as soon as it detects a violation.
Based on what I have read on the internet, preg_match stops when the first match is found.
UPDATE:
This is based on the answer given by "bishop", which I will likely accept shortly.
I modified it a little because I only want it to report the violating character, but I commented that line out so the benchmark can run without the output overhead.
Let's do a 1 million run based on that answer.
$start_time = microtime(TRUE);
$count = 0;
while ($count < 1000000){
$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$input = 'abc-def';
$validLen = strspn($input, $allowed);
if ($validLen < strlen($input)){
#echo "violation at: ". substr($input, $validLen,1);
}
$count = $count + 1;
};
$end_time = microtime(TRUE);
$dif = $end_time - $start_time;
echo $dif;
The result is: 0.606614112854
(about 0.6 seconds)
Let's do it with the preg_match method.
I hope the comparison is equivalent and fair.
(I say this because there is the ^ character in the preg_match pattern.)
$start_time = microtime(TRUE);
$count = 0;
while ($count < 1000000){
$input = 'abc-def';
preg_match("/[^a-z0-9]/i", $input, $m);
#echo "violation at:". $m[0];
$count = $count + 1;
};
$end_time = microtime(TRUE);
$dif = $end_time - $start_time;
echo $dif;
("dif" is short for "difference".)
The "dif" was: 1.1145210266113
(about 1.1 seconds, roughly 1.8 times as long as the strspn version; 1.2 seconds would have meant 2x slower)
You want to find the location of the first character not in the given range, without using regular expressions? You might want strspn or its complement strcspn:
$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$input = 'abc-def';
$validLen = strspn($input, $allowed);
if (strlen($input) !== $validLen) {
printf('Input invalid, starting at %s', substr($input, $validLen));
} else {
echo 'Input is valid';
}
Outputs: Input invalid, starting at -def
strspn and its complement strcspn mirror the C library functions of the same names, which are very old and very well specified (ISO C and POSIX). PHP implements them natively in C, so they should be fast, too.
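For the case in the question, where only the first offending character is wanted, the same idea can be wrapped in a small function. This is a sketch: first_violation is a hypothetical name, and the nullable return type assumes PHP 7.1 or later. The last line shows the complement, strcspn, which measures the leading run that contains none of the listed characters:

```php
<?php
// Hypothetical helper: returns the first character of $input that is not in
// $allowed, or null if every character is allowed.
function first_violation(string $input, string $allowed): ?string
{
    // Length of the leading run made up only of allowed characters.
    $validLen = strspn($input, $allowed);
    return $validLen < strlen($input) ? $input[$validLen] : null;
}

$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

var_dump(first_violation('abc-def', $allowed)); // string(1) "-"
var_dump(first_violation('abcdef', $allowed));  // NULL
var_dump(strcspn('abc-def', '-_ '));            // int(3)
```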

most efficient means of parsing a simple clearly defined string?

I'm only asking because this is looping millions of times.
string is simply like this:
01-20
It's always like that: 2 digits (leading zero) followed by a hyphen and another 2 digits (leading zero). I simply need to assign the first (as an integer) to one variable and the second (as an integer) to another variable.
str_split? substr? explode? regex?
Given a variable $txt, this has the best performance:
$a = (int)$txt;
$b = (int)substr($txt, -2);
You could measure the performance of different alternatives with a script like this:
<?php
$txt = "01-02";
$test_count = 4000000;
// SUBSTR -2
$time_start = microtime(true);
for ($x = 0; $x <= $test_count; $x++) {
$a = (int)$txt; // numeric conversion ignores second part of string.
$b = (int)substr($txt, -2);
}
$duration = round((microtime(true) - $time_start) * 1000);
echo "substr(s,-2): {$a} {$b}, process time: {$duration}ms <br />";
// SUBSTR 3, 2
$time_start = microtime(true);
for ($x = 0; $x <= $test_count; $x++) {
$a = (int)$txt; // numeric conversion ignores second part of string.
$b = (int)substr($txt, 3, 2);
}
$duration = round((microtime(true) - $time_start) * 1000);
echo "substr(s,3,2): {$a} {$b}, process time: {$duration}ms <br />";
// STR_SPLIT
$time_start = microtime(true);
for ($x = 0; $x <= $test_count; $x++) {
$arr = str_split($txt, 3);
$a = (int)$arr[0]; // the ending hyphen does not break the numeric conversion
$b = (int)$arr[1];
}
$duration = round((microtime(true) - $time_start) * 1000);
echo "str_split(s,3): {$a} {$b}, process time: {$duration}ms <br />";
// EXPLODE
$time_start = microtime(true);
for ($x = 0; $x <= $test_count; $x++) {
$arr = explode('-', $txt);
$a = (int)$arr[0];
$b = (int)$arr[1];
}
$duration = round((microtime(true) - $time_start) * 1000);
echo "explode('-',s): {$a} {$b}, process time: {$duration}ms <br />";
// PREG_MATCH
$time_start = microtime(true);
for ($x = 0; $x <= $test_count; $x++) {
preg_match('/(..).(..)/', $txt, $arr);
$a = (int)$arr[1];
$b = (int)$arr[2];
}
$duration = round((microtime(true) - $time_start) * 1000);
echo "preg_match('/(..).(..)/',s): {$a} {$b}, process time: {$duration}ms <br />";
?>
When I ran this on PhpFiddle Lite I got results like this:
substr(s,-2): 1 2, process time: 851ms
substr(s,3,2): 1 2, process time: 971ms
str_split(s,3): 1 2, process time: 1568ms
explode('-',s): 1 2, process time: 1670ms
preg_match('/(..).(..)/',s): 1 2, process time: 3328ms
substr with either (s, -2) or (s, 3, 2) as arguments performs almost equally well, provided you use only one call. Sometimes the second version came out as the winner. str_split and explode perform rather close to each other, but not as well, and preg_match is the clear loser. The results depend on the server load, so you should try this on your own set-up. But regular expressions clearly carry a heavy overhead here. Avoid them when you can do the job with the other string functions.
I edited my answer when I realised that you can cast the original string immediately to int, which will ignore the part it cannot parse. This practically means you can get the first part as a number without calling any of the string functions. This was decisive to make substr the absolute winner!
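The cast behaviour described here can be seen in isolation. A minimal sketch, assuming the input always has the fixed NN-NN shape from the question (parse_pair is a hypothetical name):

```php
<?php
// Hypothetical helper: splits a fixed-format "NN-NN" string into two integers.
function parse_pair(string $txt): array
{
    // (int) reads the leading digits and stops at the hyphen.
    $a = (int)$txt;
    // The last two characters hold the second number.
    $b = (int)substr($txt, -2);
    return [$a, $b];
}

var_dump(parse_pair('01-20') === [1, 20]); // bool(true)
var_dump(parse_pair('09-08') === [9, 8]);  // bool(true): leading zeros are not read as octal
```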
Try converting the string to an array, then assign each array element to the variable you want:
<?php
$str = '01-20';
$number = explode('-', $str);
$variable_1 = (int)$number[0];
$variable_2 = (int)$number[1];
?>

PHP Runtime Issue when Breaking a 10,000 char string into segments

$chapter is a string that stores a chapter of a book with 10,000 - 15,000 characters. I want to break up the string into segments with a minimum of 1000 characters but officially break after the next whitespace, so that I don't break up a word. The provided code will run successfully about 9 times and then it will run into a run time issue.
"Fatal error: Maximum execution time of 30 seconds exceeded in D:\htdocs\test.php on line 16"
<?php
$chapter = ("10000 characters");
$len = strlen($chapter);
$i=0;
do{$key="a";
for($k=1000;($key != " ") && ($i <= $len); $k = $k+1) {
$j=$i+$k; echo $j;
$key = substr($chapter,$j,1);
}
$segment = substr ($chapter,$i,$k);
$i=$j;
echo ($segment);
} while($i <= $len);
?>
I think your approach has too much overhead. Increasing max_execution_time would help, but not everyone is able to modify their server settings. This simple code split 15,000 bytes of lorem ipsum text (2k words) into segments of at most 1000 characters, breaking at the last space before that limit. I assume it would do well with more, as the execution time was fairly quick.
//Define variables, Set $x as int(1 = true) to start
$chapter = ("15000 bytes of Lorum Ipsum Here");
$sections = array();
$x = 1;
//Start Splitting
while( $x ) {
//Get current length of $chapter
$len = strlen($chapter);
//If $chapter is longer than 1000 characters
if( $len > 1000 ) {
//Get Position of last space character before 1000
$x = strrpos( substr( $chapter, 0, 1000), " ");
//If $x is not FALSE - Found last space
if( $x ) {
//Add to $sections array, assign remainder to $chapter again
$sections[] = substr( $chapter, 0, $x );
$chapter = substr( $chapter, $x );
//If $x is FALSE - No space in string
} else {
//Add last segment to $sections for debugging
//Last segment will not have a space. Break loop.
$sections[] = $chapter;
break;
}
//If remaining $chapter is not longer than 1000, simply add to array and break.
} else {
$sections[] = $chapter;
break;
}
}
print_r($sections);
Edit:
Tested with 5k Words (33K bytes) In a fraction of a second. Divided the text up into 33 segments. (Whoops, I had it set to divide into 10K character segments, before.)
Added verbose comments to code, as to explain what everything does.
Here is a simple function to do that:
$chapter = "Your full chapter";
breakChapter($chapter, 1000);
function breakChapter($chapter, $size){
    do{
        if(strlen($chapter) < $size){
            $segment = $chapter;
            $chapter = '';
        }else{
            $pos = strpos($chapter, ' ', $size);
            if ($pos === false){
                $segment = $chapter;
                $chapter = '';
            }else{
                $segment = substr($chapter, 0, $pos);
                $chapter = substr($chapter, $pos + 1);
            }
        }
        echo $segment . "\n";
    }while ($chapter != '');
}
Checking each character individually is not a good option; it is resource- and time-intensive.
PS: I have not tested this (just typed it in here), and it may not be the best way to do this, but the logic works!
You are always reading the $chapter from the start. You should delete the already read characters from $chapter so you will never read much more than 10000 characters. If you do this, you must also tweak the cycles.
Try
set_time_limit(240);
at the beginning of the code. (This is the ThrowSomeHardwareAtIt approach.)
It can be done in a single line, which speeds up your code a lot:
echo $segment = substr($chapter, 0, strpos($chapter, " ", 1000));
It will take the substring of the chapter up to the first space at or after position 1000.
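Note that the one-liner returns only the first segment. To get all of them with the same strpos-with-offset idea, without rereading the string from the start each time, something like the following could work. It is a sketch under the question's stated rule (at least $min characters, then break at the next space); split_chapter is a hypothetical name:

```php
<?php
// Hypothetical helper: splits $chapter into segments of at least $min
// characters, each ending just before the next space (the space is dropped).
function split_chapter(string $chapter, int $min = 1000): array
{
    $segments = [];
    $len = strlen($chapter);
    $start = 0;
    while ($start < $len) {
        $target = $start + $min;
        // Only search when the offset is still inside the string.
        $pos = $target < $len ? strpos($chapter, ' ', $target) : false;
        if ($pos === false) {
            // No further break point: the rest is the last segment.
            $segments[] = substr($chapter, $start);
            break;
        }
        $segments[] = substr($chapter, $start, $pos - $start);
        $start = $pos + 1; // continue after the space
    }
    return $segments;
}

print_r(split_chapter('aaa bbb ccc ddd', 5));
```

Each character is visited at most once by strpos, so the running time grows linearly with the chapter length.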

measuring the elapsed time between code segments in PHP

From time to time, I'd like to be able to measure the elapsed time between two segments of code. This is solely to detect the bottlenecks within the code and improve what can be improved.
I'd like to design a function that works with a global variable and echoes out the elapsed time between the current call and the last time it was called.
This way, you can use it many times, one after the other.
The function should be able to calculate the differences in fractions of seconds, such as 0.1 sec or 0.3 sec.
An example would probably explain it much better.
echo time_elapsed();
// This echo outputs nothing cause this is the starting case.
// There is nothing to compare against.
//
// 1st code section here
//
echo time_elapsed();
// This echo outputs 0.5 seconds.
// ...which means there has been 0.5 seconds passed
// ...since the last time time_elapsed() was fired
//
// 2nd code section here
//
echo time_elapsed();
// This echo outputs 0.2 seconds
//
// 3rd code section here
//
echo time_elapsed();
// This echo outputs 0.1 seconds etc
My question is what PHP utilities ( built-in functions ) do I need to use to achieve this kind of output?
A debugger like XDebug/Zend Debugger can give you this type of insight (plus much more), but here is a hint at how you can write a function like that:
function time_elapsed()
{
    static $last = null;
    $now = microtime(true);
    if ($last != null) {
        echo '<!-- ' . ($now - $last) . ' -->';
    }
    $last = $now;
}
Mainly the function microtime() is all you need in order to do the time calculations. To avoid a global variable, I use a static variable within the elapsed function. Alternatively, you could create a simple class that can encapsulate the required variables and make calls to a class method to track and output the time values.
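The class-based alternative mentioned above might look like this. It is only a sketch: ElapsedTimer and lap() are hypothetical names, and the typed property assumes PHP 7.4 or later. Unlike the function above, lap() returns the value instead of echoing it, which matches the echo time_elapsed(); usage in the question:

```php
<?php
// Hypothetical class: remembers the time of the previous call and returns
// the seconds elapsed since then.
final class ElapsedTimer
{
    private ?float $last = null;

    public function lap(): ?float
    {
        $now = microtime(true);
        // First call: nothing to compare against yet.
        $elapsed = $this->last === null ? null : $now - $this->last;
        $this->last = $now;
        return $elapsed;
    }
}

$timer = new ElapsedTimer();
$timer->lap();                        // starting point, returns null
usleep(1000);                         // stand-in for the first code section
printf("%.4f sec\n", $timer->lap());  // time spent in that section
```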
From the first example in the php docs:
<?php
/**
* Simple function to replicate PHP 5 behaviour
*/
function microtime_float()
{
list($usec, $sec) = explode(" ", microtime());
return ((float)$usec + (float)$sec);
}
$time_start = microtime_float();
// Sleep for a while
usleep(100);
$time_end = microtime_float();
$time = $time_end - $time_start;
echo "Did nothing in $time seconds\n";
Something along these lines should work:
$start = microtime(true);
// Do something
sleep(2);
$end = (microtime(true) - $start);
echo "elapsed time: $end";
The same function as drew010's (thanks!), with a custom comment added and times displayed in microseconds (μs):
function time_elapsed($comment)
{
static $time_elapsed_last = null;
static $time_elapsed_start = null;
// $unit="s"; $scale=1000000; // output in seconds
// $unit="ms"; $scale=1000; // output in milliseconds
$unit="μs"; $scale=1; // output in microseconds
$now = microtime(true);
if ($time_elapsed_last != null) {
echo "\n";
echo '<!-- ';
echo "$comment: Time elapsed: ";
echo round(($now - $time_elapsed_last)*1000000)/$scale;
echo " $unit, total time: ";
echo round(($now - $time_elapsed_start)*1000000)/$scale;
echo " $unit -->";
echo "\n";
} else {
$time_elapsed_start=$now;
}
$time_elapsed_last = $now;
}
Example:
// Start timer
time_elapsed('');
// Do something
usleep(100);
time_elapsed('Now awake, sleep again');
// Do something
usleep(100);
time_elapsed('Game over');
Output:
<!-- Now awake, sleep again: Time elapsed: 100 us, total time: 100 us -->
<!-- Game over: Time elapsed: 100 us, total time: 200 us -->
Other factors affect the timing of your scripts. Example:
Complex code and recursive functions.
The type of web server being used, example: shared VS dedicated hosting.
<?php
$time_start = microtime(true);
// Sleep for a while (or your code which you want to measure)
usleep(100);
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did nothing in $time seconds\n";
Source: http://php.net/manual/en/function.microtime.php#example-2568
My profiling needs in development are covered by this small yet powerful class:
<?php
class perflog {
protected $perflog = [];
public function start($tag) {
if (!isset($this->perflog[$tag])) $this->perflog[$tag] = 0;
$this->perflog[$tag] -= microtime(TRUE);
}
public function stop($tag) {
$this->perflog[$tag] += microtime(TRUE);
}
public function results() {
return $this->perflog;
}
}
It is intended to be invoked via successive start(<tag>) and stop(<tag>) calls. It produces an array with the total time your code spent in the sections enclosed by calls to start() and stop() with matching tags.
start-stop sequences may be nested and may be entered multiple times, in which case the time spent in the enclosed section is summed.
Its compactness ensures minimal performance impact. Dynamic tag creation can be used to let the program modify what it monitors. Typically the class is extended with functions for outputting or storing the results.
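A usage sketch with nested and repeated sections (the class is repeated so the snippet runs on its own; the tag names and usleep calls are placeholders for real work):

```php
<?php
class perflog {
    protected $perflog = [];
    public function start($tag) {
        if (!isset($this->perflog[$tag])) $this->perflog[$tag] = 0;
        $this->perflog[$tag] -= microtime(TRUE);
    }
    public function stop($tag) {
        $this->perflog[$tag] += microtime(TRUE);
    }
    public function results() {
        return $this->perflog;
    }
}

$log = new perflog();
$log->start('outer');
for ($i = 0; $i < 3; ++$i) {
    $log->start('inner');   // entered three times; the times add up
    usleep(1000);           // placeholder for the code being measured
    $log->stop('inner');
}
$log->stop('outer');

// One line per tag with the accumulated seconds.
foreach ($log->results() as $tag => $seconds) {
    printf("%-6s %.4f s\n", $tag, $seconds);
}
```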

Fastest way of getting a character inside a string given the index (PHP)

I know of several ways to get a character off a string given the index.
<?php
$string = 'abcd';
echo $string[2];
echo $string{2};
echo substr($string, 2, 1);
?>
I don't know if there are any more ways, if you know of any please don't hesitate to add it. The question is, if I were to choose and repeat a method above a couple of million times, possibly using mt_rand to get the index value, which method would be the most efficient in terms of least memory consumption and fastest speed?
To arrive at an answer, you'll need to set up a benchmark test rig. Compare all methods over several hundred thousand or several million iterations on an idle box. Use the built-in microtime function to measure the difference between start and finish. That's your elapsed time.
The test should take you all of 2 minutes to write.
To save you some effort, I wrote a test. It shows that the functional solution (substr) is MUCH slower (as expected). The braces syntax ({}) is as fast as the index syntax ([]); the two are interchangeable. The [] syntax is preferred, as this is the direction PHP is going regarding string offsets: the braces syntax was deprecated in PHP 7.4 and removed in PHP 8.0.
<?php
$string = 'abcd';
$limit = 1000000;
$r = array(); // results
// PHP idiomatic string index method
$s = microtime(true);
for ($i = 0; $i < $limit; ++$i) {
$c = $string{2};
}
$r[] = microtime(true) - $s;
echo "\n";
// PHP functional solution
$s = microtime(true);
for ($i = 0; $i < $limit; ++$i) {
$c = substr($string, 2, 1);
}
$r[] = microtime(true) - $s;
echo "\n";
// index method
$s = microtime(true);
for ($i = 0; $i < $limit; ++$i) {
$c = $string[2];
}
$r[] = microtime(true) - $s;
echo "\n";
// RESULTS
foreach ($r as $i => $v) {
echo "RESULT ($i): $v \n";
}
?>
Results:
RESULT (PHP4 & 5 idiomatic braces syntax): 0.19106006622314
RESULT (string slice function): 0.50699090957642
RESULT (*index syntax, the future as the braces are being deprecated *): 0.19102001190186
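Since the braces syntax no longer parses on PHP 8 (it was removed in 8.0), the script above needs that loop dropped to run on current versions. Here is a reduced sketch comparing only [] and substr, using hrtime (available from PHP 7.3) instead of microtime; the timings will differ from the results above:

```php
<?php
$string = 'abcd';
$limit  = 1000000;

// Index syntax
$start = hrtime(true);                 // monotonic clock, in nanoseconds
for ($i = 0; $i < $limit; ++$i) {
    $c1 = $string[2];
}
$index_secs = (hrtime(true) - $start) / 1e9;

// substr()
$start = hrtime(true);
for ($i = 0; $i < $limit; ++$i) {
    $c2 = substr($string, 2, 1);
}
$substr_secs = (hrtime(true) - $start) / 1e9;

printf("index:  %.4f s\nsubstr: %.4f s\n", $index_secs, $substr_secs);
```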
