strpos for arrays - php

Let's say I have 90 elements in my array and a function that checks whether a value exists in that array. Would it be faster if I used strpos() on a string instead?
Instead of using in_array() on
$data = array('John','Mary','Steven');
It will be
$data = 'John.Mary.Steven';
and then I'll just call strpos() on that string?

Without bothering to profile it, I'd say that imploding to a string and then calling strpos() would be slower than PHP's built-in in_array(), because you add the overhead of converting the entire array (all 90 elements) to a string before you can even call strpos(). Premature micro-optimisation isn't a good idea; optimise only when you really need to, and then test your ideas.
EDIT
If you're using your own function instead of in_array(), it probably is slower, which raises the question: why?
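To make the trade-off concrete, here is a minimal sketch. The helper name inDelimitedString is hypothetical and not part of the question. It also shows a pitfall of the string approach: a plain strpos() matches fragments of a name, so both the string and the needle have to be wrapped in the delimiter.

```php
<?php
$data = array('John', 'Mary', 'Steven');

// Direct lookup in the array: no conversion needed.
$found = in_array('Mary', $data); // true

// String variant: implode() walks the whole array before strpos() can run.
$string = implode('.', $data); // "John.Mary.Steven"

// Pitfall: a plain strpos() also matches fragments of a name.
$partial = strpos($string, 'Mar') !== false; // true, although 'Mar' is not an element

// Hypothetical helper: wrap haystack and needle in the delimiter so only
// whole elements match. Assumes no element contains the delimiter itself.
function inDelimitedString($needle, $string, $delim = '.')
{
    return strpos($delim . $string . $delim, $delim . $needle . $delim) !== false;
}
```

With this helper, inDelimitedString('Mary', $string) is true while inDelimitedString('Mar', $string) is false, which matches what in_array() reports.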

I was quite sure that strpos would be slower, but I ran the test below, and it looks like (at least in this particular case, searching for the last element) strpos is faster than in_array.
$array = array();
for ($i = 0; $i < 10000; $i++) {
    $array[] = md5($i . date('now'));
}
$string = implode('.', $array);
$lastElement = $array[9999];
$start = microtime(TRUE);
$isit = in_array($lastElement, $array);
$end = microtime(TRUE);
echo ($end - $start) . PHP_EOL;
$start = microtime(TRUE);
$pos = strpos($string, $lastElement);
$end = microtime(TRUE);
echo ($end - $start) . PHP_EOL;
Results I'm getting:
0.0012338161468506
0.00036406517028809

According to this test, looping over your array and checking each element with strpos() would be slower than just using in_array(). The authors claim that in_array() is 2.4 times faster than a foreach loop with strpos().
On the other hand, a question here on SO seems to indicate otherwise.
Efficiency of Searching an Array Vs Searching in Text.... Which is Better?
If I were you, I would run my own performance tests to see what works best with my specific set of data.
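A minimal sketch of such a test, assuming placeholder data: the helper averageTime, the 'nameN' values and the iteration count are all made up for illustration. Averaging over many calls smooths out the noise of a single microtime() measurement.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return the average
// time per call in seconds.
function averageTime(callable $fn, $iterations = 1000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

// Placeholder data: swap in your real array and the values you look up.
$array = array();
for ($i = 0; $i < 90; $i++) {
    $array[] = 'name' . $i;
}
// Leading and trailing delimiter so that only whole elements match.
$string = '.' . implode('.', $array) . '.';
$needle = 'name89';

$viaArray = averageTime(function () use ($array, $needle) {
    return in_array($needle, $array);
});
$viaString = averageTime(function () use ($string, $needle) {
    return strpos($string, '.' . $needle . '.') !== false;
});

printf("in_array: %.9f s\nstrpos:   %.9f s\n", $viaArray, $viaString);
```

The numbers depend on the PHP version, the data and the position of the needle, so it is worth repeating the run with needles at the start, in the middle and missing entirely.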

Related

Is there any manual method other than "str_repeat" to repeat the string?

I mean, if we pass 3 and "b" as parameters to a function, it should return "bbb" by using loops.
I've tried some code, but I do not want to post it because it might look crazy to a well-versed developer. I can provide a link; this question was asked in an interview, where they mainly want it written in C or C++. Since I am a PHP practitioner, I am curious whether it is possible in PHP. Below is the link (see ROUND 2: SIMPLE CODING (3 hours)):
https://www.geeksforgeeks.org/zoho-interview-set-3-campus/
A PHP function to do that would probably look like this:
function string_repeat($num, $string)
{
    $result = "";
    for ($x = 0; $x < $num; $x++) {
        $result .= $string;
    }
    return $result;
}
So calling echo string_repeat(3, 'b'); would output:
bbb
One way would be to keep around a "dummy" string, of sufficient length to be longer than any string you want to generate. Then, we can use preg_replace to replace each character with whatever the input is. Finally, we can substring that replace string to the desired length.
$dummy = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
$length = 3;
$dummy = preg_replace('/./', 'b', $dummy);
$output = substr($dummy, 0, $length);
echo $output;
This prints:
bbb
You could wrap this into a helper function, if you really wanted to do that. Note that internally the regex engine is most likely doing some looping of its own, so we are not really freeing ourselves from looping, just hiding it from the current context.
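Such a helper might look like the sketch below. The function name repeat_via_dummy is hypothetical, and the code assumes a single-character input that has no special meaning in a preg_replace() replacement string (no "$" or backslash).

```php
<?php
// Hypothetical helper wrapping the dummy-string trick.
// Assumptions: $char is one plain character, and $length does not exceed
// the length of the dummy string.
function repeat_via_dummy($char, $length)
{
    $dummy = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
    if ($length > strlen($dummy)) {
        throw new InvalidArgumentException('Dummy string is too short');
    }
    // Replace every character of the dummy string with $char ...
    $filled = preg_replace('/./', $char, $dummy);
    // ... and cut the result down to the requested length.
    return substr($filled, 0, $length);
}

echo repeat_via_dummy('b', 3); // bbb
```

The length check makes the main limitation explicit: the result can never be longer than the dummy string.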

PHP: Compare the start of two strings

I am wondering if there is a simple way, in PHP, to compare two strings and return the number of characters they have in common from the start of the string.
An example:
$s1 = "helloworld";
$s2 = "hellojohn";
These two strings both start with 'hello', which means that both strings have the first 5 characters in common. '5' is the value I'd like to receive when comparing these two strings.
Is there a computationally fast way of doing this without comparing both strings as arrays to each other?
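function commonChars($s1, $s2) {
    // Only the overlapping part of the two strings can match.
    $IMAX = min(strlen($s1), strlen($s2));
    for ($i = 0; $i < $IMAX; $i++) {
        if ($s2[$i] !== $s1[$i]) break;
    }
    // $i is the index of the first mismatch, i.e. the number of matching characters.
    return $i;
}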
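If the strings are really big, then I would write my own binary search. Something similar to this rough sketch, which returns the length of the common prefix within [$start, $end) and is called as compareSection(0, min(strlen($s1), strlen($s2)), $s1, $s2):
function compareSection($start, $end, $string1, $string2) {
    $length = $end - $start;
    if ($length <= 0) return 0;
    // Whole section identical: every character in it matches.
    if (substr($string1, $start, $length) === substr($string2, $start, $length)) return $length;
    if ($length === 1) return 0;
    $mid = $start + intdiv($length, 2);
    // Count matches in the first half; stop there if it contains the mismatch.
    $firstMatches = compareSection($start, $mid, $string1, $string2);
    if ($firstMatches < $mid - $start) return $firstMatches;
    // First half matched completely, so continue in the second half.
    return $firstMatches + compareSection($mid, $end, $string1, $string2);
}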
If it's the similarity of the strings you wish to get and not just the actual number of identical characters, there are two functions for that: similar_text and levenshtein. Maybe they suit your goal better than what you asked for in this question.
From my knowledge, I don't think there is a built in function for something like this. Most likely, you will have to make your own.
Shouldn't be too hard. Just loop over both strings index by index until you find a pair of characters that doesn't match. However far you got is the answer.
Hope that helps!
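As an alternative to the explicit loop, two built-ins can be combined. This is a different technique from the answers above, sketched under the assumption that byte-wise comparison is acceptable (it counts bytes, not multibyte characters). The function name commonPrefixLength is hypothetical.

```php
<?php
// Length of the common prefix of two strings, without an explicit loop.
function commonPrefixLength($s1, $s2)
{
    // XOR of two strings works byte by byte and is as long as the shorter
    // string; equal bytes produce "\0".
    $diff = $s1 ^ $s2;
    // strspn() counts how many leading bytes of $diff are "\0".
    return strspn($diff, "\0");
}

echo commonPrefixLength("helloworld", "hellojohn"); // 5
```

Both steps run in native code, so this may be faster than a PHP-level loop, but that is worth measuring on your own data.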

What is an elegant way to accomplish: explode -> apply some_func -> implode without a foreach?

I have a string of delimiter-separated values, and need to perform some_func on each of the values in the string. What is the most compact/elegant way to accomplish this? Here is the code I would currently use:
$delim = ',';
$source_array = explode($delim, $source_string);
$destination_array = array();
foreach ($source_array as $val) {
$destination_array[] = some_function($val);
}
$destination_string = implode($delim, $destination_array);
Ordering is not important, but would be nice to preserve.
Thanks!
You're looking for array_map:
$delim = ',';
$source_array = explode($delim, $source_string);
$destination_array = array_map('some_function', $source_array);
$destination_string = implode($delim, $destination_array);
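The same steps can be chained into one expression. In this sketch the sample string is made up, and trim() plus strtoupper() stand in for some_function, which the question does not define.

```php
<?php
$delim = ',';
$source_string = ' apple, Banana ,cherry ';

// explode -> apply a function to every value -> implode, in one expression.
// array_map() preserves the order of the values.
$destination_string = implode($delim, array_map(
    function ($val) {
        return strtoupper(trim($val));
    },
    explode($delim, $source_string)
));

echo $destination_string; // APPLE,BANANA,CHERRY
```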
Try regular expressions (regex). Or, if the algorithm is heavy, write a script in Perl to handle all this and call it from PHP. It will make things easier.
Well, I know, someone will suggest Python or another scripting language, but that's my 2 cents.
You could use array_walk.
This way you can apply a function to each entry in the array. So your code would look like this:
$delim = ',';
$parts = explode($delim, $source_string);
array_walk($parts, 'some_function');
$destination_string = implode($delim, $parts);
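Compared to array_map, it won't generate a new array but works on the existing one, though that's only going to be significant if you work with huge arrays (think memory). Note that array_walk ignores the callback's return value, so this only works if some_function takes its argument by reference and modifies it in place. And of course it won't work if you have to access other array values in some_function.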
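A minimal sketch of the array_walk variant with a by-reference callback. The sample string is made up, and trim() stands in for some_function, which the question does not define.

```php
<?php
$delim = ',';
$source_string = ' a, b ,c ';
$parts = explode($delim, $source_string);

// array_walk() discards the callback's return value, so the callback takes
// the value by reference (&$val) and overwrites it in place.
array_walk($parts, function (&$val) {
    $val = trim($val);
});

$destination_string = implode($delim, $parts);
echo $destination_string; // a,b,c
```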

Is this a fair method of algorithm comparison?

I wanted to convert an array to lowercase and was wondering the most efficient method. I came up with two options, one using array_walk and one using foreach and wanted to compare them. Is this the best way to compare the two? Is there an even more efficient method that I have overlooked?
<?php
$a = array_fill(0, 200000, genRandomString());
$b = array_fill(0, 200000, genRandomString());
$t = microtime(true);
array_walk($a, create_function('&$a', '$a = strtolower($a);'));
echo "array_walk: ".(microtime(true) - $t);
echo "<br />";
$t = microtime(true);
foreach($b as &$source) { $source = strtolower($source); }
echo "foreach: ".(microtime(true) - $t);
function genRandomString($length = 10) {
    $characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    $string = '';
    for ($p = 0; $p < $length; $p++) {
        $string .= $characters[mt_rand(0, strlen($characters)-1)];
    }
    return $string;
}
The output:
array_walk: 0.52975487709045
foreach: 0.29656505584717
Two questions in one!
How to run the tests:
Personally, I'd write individual test scripts for each method, then use the Apache ab utility to run the tests:
ab -n 100 -c 1 http://localhost/arrayWalkTest.php
ab -n 100 -c 1 http://localhost/foreachTest.php
That gives me a much more detailed set of statistics for comparison.
I'd also try to ensure that the two methods were working on identical datasets for each test, not different random data.
The most efficient method:
You should unset($source) after your loop as a safety measure: because you're accessing by reference in the loop, $source will still contain a reference to the last entry in the array and may give you unpredictable results if you reference $source anywhere else in your script.
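A short sketch of what can go wrong. The arrays and the second loop are made up for illustration: reusing the variable name in a later by-value loop silently overwrites the last element.

```php
<?php
$b = array('A', 'B', 'C');
foreach ($b as &$source) {
    $source = strtolower($source);
}
// No unset() here: $source is still a reference to $b[2].

foreach ($b as $source) {
    // Every iteration assigns to $source, which writes into $b[2].
}
print_r($b); // elements are now a, b, b

// Same code with unset(): the reference is broken before the name is reused.
$c = array('A', 'B', 'C');
foreach ($c as &$item) {
    $item = strtolower($item);
}
unset($item);

foreach ($c as $item) {
    // $item is an ordinary variable again, so $c is left alone.
}
print_r($c); // elements are a, b, c
```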
I have had lots of weird results in the past when using the microtime approach instead of a dedicated profiler, such as the ones in Xdebug or Zend Debugger. Also, for a fair comparison your arrays should be identical instead of two random arrays.
In addition, you could consider using array_map and strtolower:
$a = array_map('strtolower', $a);
which would save you the lambda for array_walk. Anonymous functions created with create_function (unlike PHP 5.3's anonymous functions) are known to be slow, and strtolower is a native function, so using it directly should be faster.
I did a quick benchmark and I don't see any relevant speed difference between this approach and your foreach. As so often, I'd say it's a µ-opt. Of course, you should test that in a real-world application if you think it matters. Synthetic benchmarks are fun, but ultimately useless.
On a sidenote, to change the array keys, you can use
array_change_key_case — Changes all keys in an array
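I don't know PHP, so this is a wild guess: join the values with a delimiter that cannot occur in them (a newline here), lowercase the whole string once, and split it again:
explode("\n", strtolower(implode("\n", $a)))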

Which is faster in PHP, $array[] = $value or array_push($array, $value)?

What's better to use in PHP for appending an array member,
$array[] = $value;
or
array_push($array, $value);
?
Though the manual says you're better off to avoid a function call, I've also read $array[] is much slower than array_push(). What are some clarifications or benchmarks?
I personally feel like $array[] is cleaner to look at, and honestly splitting hairs over milliseconds is pretty irrelevant unless you plan on appending hundreds of thousands of strings to your array.
I ran this code:
$t = microtime(true);
$array = array();
for ($i = 0; $i < 10000; $i++) {
    $array[] = $i;
}
print microtime(true) - $t;
print '<br>';
$t = microtime(true);
$array = array();
for ($i = 0; $i < 10000; $i++) {
    array_push($array, $i);
}
print microtime(true) - $t;
The first method using $array[] is almost 50% faster than the second one.
Some benchmark results:
Run 1
0.0054171085357666 // array_push
0.0028800964355469 // array[]
Run 2
0.0054559707641602 // array_push
0.002892017364502 // array[]
Run 3
0.0055501461029053 // array_push
0.0028610229492188 // array[]
This shouldn't be surprising, as the PHP manual notes this:
If you use array_push() to add one element to the array it's better to use $array[] = because in that way there is no overhead of calling a function.
The way it is phrased I wouldn't be surprised if array_push is more efficient when adding multiple values. Out of curiosity, I did some further testing, and even for a large amount of additions, individual $array[] calls are faster than one big array_push. Interesting.
The main use of array_push() is that you can push multiple values onto the end of the array.
It says in the documentation:
If you use array_push() to add one element to the array it's better to use $array[] = because in that way there is no overhead of calling a function.
From the PHP documentation for array_push:
Note: If you use array_push() to add one element to the array it's better to use $array[] = because in that way there is no overhead of calling a function.
Word on the street is that [] is faster because no overhead for the function call. Plus, no one really likes PHP's array functions...
"Is it...haystack, needle....or is it needle haystack...ah, f*** it...[] = "
One difference is that you can call array_push() with more than two parameters, i.e. you can push more than one element at a time to an array.
$myArray = array();
array_push($myArray, 1,2,3,4);
echo join(',', $myArray);
prints 1,2,3,4
A simple $myarray[] assignment will be quicker, because you just append an item to the array without the overhead that a function call brings.
Since array_push() is a function, calling it repeatedly inside a loop adds function-call overhead (a stack frame for every call) on each iteration.
But when we use $array[] = $value, we are just assigning a value to the array.
The second one is a function call, so generally it should be slower than using the core array-access syntax. But I think even one database query within your script will outweigh 1000000 calls to array_push().
See here for a quick benchmark using 1000000 inserts: https://3v4l.org/sekeV
I just want to add: array_push(...) returns the new number of elements in the array (see the PHP documentation), which can be useful and more compact than $myArray[] = ...; $total = count($myArray);.
Also, array_push(...) is meaningful when the variable is used as a stack.
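Both points in a minimal sketch; the variable names and values are made up for illustration.

```php
<?php
$stack = array();

// array_push() returns the new element count, so no separate count() call
// is needed after pushing.
$total = array_push($stack, 'a', 'b', 'c'); // 3

// Paired with array_pop(), the array behaves like a stack (last in, first out).
$top = array_pop($stack); // 'c'

echo $total . ' ' . $top . ' ' . count($stack); // 3 c 2
```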
