How do I detect if a certain sequence of elements is present in an array? For example, given this needle and these haystack arrays:
$needle = array(1,1);
$haystack1 = array(0,1,0,0,0,1,1,0,1,0);
$haystack2 = array(0,0,0,0,1,0,1,0,0,1);
How does one detect if the sequence $needle is present in e.g. $haystack1? This method should return TRUE for $haystack1 and FALSE for $haystack2.
Thanks for any suggestions!
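Join the arrays, and check for the strpos of the needle. strpos returns false when there is no match, so compare the result with !== false (a >= 0 test is also true for false).
if ( strpos( join($haystack1), join($needle) ) !== false ) {
    echo "Items found";
}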
Demo: http://codepad.org/F13DLWOI
Warning
This will not work for complicated items like objects or arrays within the haystack array. This method is best used with items like numbers and strings.
If they're always a single digit/character then you can convert all elements to strings, join by '', and use the regex or string functions to search.
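A minimal sketch of that join-based check, assuming every element really is a single character. The helper name contains_sequence_by_join is made up for illustration. The last call shows the kind of false positive that appears as soon as an element is longer than one character:

```php
<?php
// Hypothetical helper: only reliable when every element is a single character.
function contains_sequence_by_join(array $needle, array $haystack) {
    return strpos(implode('', $haystack), implode('', $needle)) !== false;
}

var_dump(contains_sequence_by_join(array(1, 1), array(0, 1, 0, 0, 0, 1, 1, 0, 1, 0))); // bool(true)
var_dump(contains_sequence_by_join(array(1, 1), array(0, 0, 0, 0, 1, 0, 1, 0, 0, 1))); // bool(false)

// False positive: the haystack holds the single element 11, not the sequence 1, 1.
var_dump(contains_sequence_by_join(array(1, 1), array(0, 11, 0))); // bool(true)
```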
For the specific case where no array element is a prefix of any other element (when both are converted to strings) then the already posted answers will work fine and probably be pretty fast.
Here's an approach that will work correctly in the general case:
function subarray_exists(array $needle, array $haystack) {
    if (count($needle) > count($haystack)) {
        return false;
    }
    $needle = array_values($needle);
    $iterations = count($haystack) - count($needle) + 1;
    for ($i = 0; $i < $iterations; ++$i) {
        if (array_slice($haystack, $i, count($needle)) == $needle) {
            return true;
        }
    }
    return false;
}
See it in action.
Disclaimer: There are ways to write this function that I expect will make it execute much faster when you are searching huge haystacks, but for a first approach simple is good.
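One possible speed-up, sketched under the assumption that both arrays are plain lists: compare element by element and bail out at the first mismatch, which avoids building a new slice for every position. The name subarray_exists_fast is hypothetical, and this is not necessarily the optimization the disclaimer has in mind.

```php
<?php
// Hypothetical variant: no array_slice copies, early exit on the first mismatch.
function subarray_exists_fast(array $needle, array $haystack) {
    $needle   = array_values($needle);
    $haystack = array_values($haystack);
    $n    = count($needle);
    $last = count($haystack) - $n; // last possible start position

    for ($i = 0; $i <= $last; ++$i) {
        for ($j = 0; $j < $n; ++$j) {
            if ($haystack[$i + $j] != $needle[$j]) {
                continue 2; // mismatch: try the next start position
            }
        }
        return true; // every element of the needle matched
    }
    return false;
}

var_dump(subarray_exists_fast(array(1, 1), array(0, 1, 0, 0, 0, 1, 1, 0, 1, 0))); // bool(true)
var_dump(subarray_exists_fast(array(1, 1), array(0, 0, 0, 0, 1, 0, 1, 0, 0, 1))); // bool(false)
```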
Related
I want to build an array in PHP that contains every possible capitalization permutation of a word. So it would be (pseudocode):
function permutate($word){
for ($i=0; $i<count($word); $i++){
...confused here...
array_push($myArray, $newWord)
}
return $myArray;
}
So say I put in "School" I should get an array back of
{school, School, sChool, SCHool, schOOl, ... SCHOOL}
I know of functions that capitalize the string or the first character, but I am really struggling with how to accomplish this.
This should do it for you:
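function permute($word){
    // Base case: the empty string has exactly one "permutation".
    // (A plain !$word test would also stop early on the string "0".)
    if((string)$word === '')
        return array('');
    $permutations = array();
    foreach(permute(substr($word, 1)) as $permutation){
        $lower = strtolower($word[0]);
        $permutations[] = $lower . $permutation;
        $upper = strtoupper($word[0]);
        // Characters without a distinct upper case (digits, punctuation) are added once.
        if($upper !== $lower)
            $permutations[] = $upper . $permutation;
    }
    return $permutations;
}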
Codepad Demo
However, for your particular use case there may be a better solution, as there are 2^n permutations for a string of length n. It will be infeasible to run this (or even to generate all those strings using any method at all) on a much longer string.
In reality, if you want to do case-insensitive matching, you should probably convert strings to one particular case before hashing them or storing them in the database.
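As a rough illustration of that normalization idea: the helper name normalize_case is hypothetical, and md5 stands in for whatever hash is actually used.

```php
<?php
// Hypothetical helper: pick one case before hashing or storing a string.
function normalize_case($word) {
    return strtolower($word);
}

// md5 is only a stand-in for "some hash"; the point is the normalization step.
$stored = md5(normalize_case('School'));
$lookup = md5(normalize_case('sChOoL'));

var_dump($stored === $lookup); // bool(true)
```

With this in place a single stored value covers every capitalization, so the 2^n variants never need to be generated.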
I am wondering if there is a simple way, in PHP, to compare two strings and return the number of characters they have in common from the start of the string.
An example:
$s1 = "helloworld";
$s2 = "hellojohn";
These two strings both start with 'hello', which means that both strings have the first 5 characters in common. '5' is the value I'd like to receive when comparing these two strings.
Is there a computationally fast way of doing this without comparing both strings as arrays to each other?
function commonChars($s1, $s2) {
    $IMAX = min(strlen($s1), strlen($s2));
    for($i = 0; $i < $IMAX; $i++)
        if($s2[$i] != $s1[$i]) break;
    return $i;
}
If the strings are really big, then I would write my own binary search. Something similar to this totally untested code that I just dreamed up.
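// Returns the index of the first mismatch within [$start, $end), or $end if
// the two sections are identical. For the common prefix length, call it as
// compareSection(0, min(strlen($string1), strlen($string2)), $string1, $string2).
function compareSection($start, $end, $string1, $string2) {
    $length = $end - $start;
    $substr1 = substr($string1, $start, $length);
    $substr2 = substr($string2, $start, $length);
    if ($substr1 === $substr2) return $end;
    if ($length === 1) return $start; // this single character differs
    $middle = $start + (int) ($length / 2);
    // Look in the first half; only if it matches completely, try the second half.
    $firstMismatch = compareSection($start, $middle, $string1, $string2);
    if ($firstMismatch < $middle) {
        return $firstMismatch;
    }
    return compareSection($middle, $end, $string1, $string2);
}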
If it's the similarity of the strings you wish to get and not just the actual number of identical characters, there are two functions for that: similar_text and levenshtein. Maybe they suit your goal more than what you asked for in this question.
From my knowledge, I don't think there is a built in function for something like this. Most likely, you will have to make your own.
Shouldn't be too hard. Just loop over both strings index by index until you find a pair of characters that doesn't match. However far you got is the answer.
Hope that helps!
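For the common-prefix question above, one loop-free sketch relies on PHP's byte-wise string XOR together with strspn(). The function name common_prefix_length is made up here, and the approach assumes plain byte strings (for multibyte encodings it counts bytes, not characters):

```php
<?php
// Hypothetical helper name; relies on PHP's byte-wise XOR of two strings.
function common_prefix_length($s1, $s2) {
    // XOR yields "\0" wherever both strings hold the same byte; the result
    // is as long as the shorter string. strspn() then counts the leading
    // run of "\0" bytes, which is the length of the common prefix.
    return strspn((string) $s1 ^ (string) $s2, "\0");
}

var_dump(common_prefix_length("helloworld", "hellojohn")); // int(5)
```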
I have array like this:
array('1224*', '543*', '321*' ...) which contains about 17,000 "masks" or prefixes.
I have a second array:
array('123456789', '123456788', '987654321' ....) which contains about 250,000 numbers.
Now, how can I efficiently match every number from the second array using the array of masks/prefixes?
[EDIT]
The first array contains only prefixes and every entry has only one * at the end.
Well, here's a solution:
Preliminary steps:
1. Sort array 1, cutting off the *'s.
Searching:
For each number in array 2 do:
1. Find the first and last entry in array 1 whose first character matches that of the number (binary search).
2. Do the same for the second character, this time searching not the whole array but only between first and last (binary search).
3. Repeat step 2 for the nth character until a prefix is found or the range is empty.
This should be O(k*n*log(m)), where n is the average number length (in digits), k the number of numbers and m the number of prefixes.
Basically this is a one-dimensional radix tree. For optimal performance you should implement a real one, but that can be quite hard.
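The search described above could be sketched roughly as follows. All function names are hypothetical, and the code assumes every mask is a plain prefix with at most a trailing *. Sorting with SORT_STRING matters: the default flags would compare digit strings numerically and break the binary search.

```php
<?php
// Hypothetical helper: strip the trailing * and sort the prefixes byte-wise.
function build_prefix_index(array $masks) {
    $prefixes = array();
    foreach ($masks as $mask) {
        $prefixes[] = rtrim($mask, '*');
    }
    sort($prefixes, SORT_STRING);
    return $prefixes;
}

// Binary search: first index in [$lo, $hi) whose character at $pos is
// >= $char (or > $char when $strict is true).
function prefix_bound(array $prefixes, $lo, $hi, $pos, $char, $strict) {
    while ($lo < $hi) {
        $mid = $lo + (int) (($hi - $lo) / 2);
        $cmp = strcmp($prefixes[$mid][$pos], $char);
        if ($cmp < 0 || ($strict && $cmp === 0)) {
            $lo = $mid + 1;
        } else {
            $hi = $mid;
        }
    }
    return $lo;
}

// Returns the shortest prefix matching $number, or null if there is none.
function find_matching_prefix(array $prefixes, $number) {
    $number = (string) $number;
    $lo = 0;
    $hi = count($prefixes);
    for ($pos = 0; $lo < $hi; ++$pos) {
        // Every entry in [$lo, $hi) agrees with $number on the first $pos
        // characters; an entry of exactly that length sorts first and matches.
        if (strlen($prefixes[$lo]) === $pos) {
            return $prefixes[$lo];
        }
        if ($pos >= strlen($number)) {
            return null; // the remaining prefixes are longer than the number
        }
        $newLo = prefix_bound($prefixes, $lo, $hi, $pos, $number[$pos], false);
        $hi    = prefix_bound($prefixes, $newLo, $hi, $pos, $number[$pos], true);
        $lo    = $newLo;
    }
    return null;
}

$prefixes = build_prefix_index(array('1224*', '543*', '321*'));
var_dump(find_matching_prefix($prefixes, '122456789')); // string(4) "1224"
var_dump(find_matching_prefix($prefixes, '123456788')); // NULL
```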
My two cents....
$s = array('1234*', '543*', '321*');
$f = array('123456789', '123456788', '987654321');
foreach ($f as $haystack) {
    echo $haystack."<br>";
    foreach ($s as $needle) {
        $needle = str_replace("*","",$needle);
        echo $haystack." - ".$needle.": ".startsWith($haystack, $needle)."<br>";
    }
}
function startsWith($haystack, $needle) {
    $length = strlen($needle);
    return (substr($haystack, 0, $length) === $needle);
}
To improve performance it might be a good idea to sort both arrays first and to add an exit clause in the inner foreach loop.
By the way, the startsWith function is from this great solution on SO: startsWith() and endsWith() functions in PHP
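The exit clause mentioned above could look roughly like this. The sample data is made up, the * is assumed to be stripped already, and strncmp() is used here in place of startsWith():

```php
<?php
// Hypothetical sample data, with the trailing * already stripped from the prefixes.
$prefixes = array('1234', '543', '321');
$numbers  = array('123456789', '123456788', '987654321', '543210');

$matched = array();
foreach ($numbers as $number) {
    foreach ($prefixes as $prefix) {
        if (strncmp($number, $prefix, strlen($prefix)) === 0) {
            $matched[$number] = $prefix;
            break; // exit clause: stop at the first prefix that matches
        }
    }
}
print_r($matched);
```

The break only saves work when a single match per number is enough; to collect every matching prefix the inner loop has to run to the end.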
Another option would be to use preg_grep in a loop:
$masks = array('1224*', '543*', '321*' ...);
$data = array('123456789', '123456788', '987654321' ....);
$matches = array();
foreach($masks as $mask) {
    $mask = substr($mask, 0, -1); // strip off trailing *
    $matches[$mask] = preg_grep("/^$mask/", $data);
}
No idea how efficient this would be, just offering it up as an alternative.
Although regex is not famous for being fast, I'd like to know how well preg_grep() can perform if the pattern is boiled down to its leanest form and only called once (not in a loop).
By removing longer masks which are covered by shorter masks, the pattern will be greatly reduced. How much will the reduction be? Of course, I cannot say for sure, but with 17,000 masks, there is sure to be a fair amount of redundancy.
Code: (Demo)
$masks = ['1224*', '543*', '321*', '12245*', '5*', '122488*'];
sort($masks);
$needle = rtrim(array_shift($masks), '*');
$keep[] = $needle;
foreach ($masks as $mask) {
    if (strpos($mask, $needle) !== 0) {
        $needle = rtrim($mask, '*');
        $keep[] = $needle;
    }
}
// now $keep only contains: ['1224', '321', '5']
$numbers = ['122456789', '123456788', '321876543234567', '55555555555555555', '987654321'];
var_export(
preg_grep('~^(?:' . implode('|', $keep) . ')~', $numbers)
);
Output:
array (
0 => '122456789',
2 => '321876543234567',
3 => '55555555555555555',
)
Check out the PHP function array_intersect_key.
A non-empty zero-indexed array A consisting of N integers is given. The first covering prefix of array A is the smallest integer P such that $0 \leq P < N$ and such that every value that occurs in array A also occurs in sequence $A[0], A[1], \ldots, A[P]$.
For example, the first covering prefix of array A such that
A[0]=2 A[1]=2 A[2]=1 A[3]=0 A[4]=1
is 3, because sequence A[0], A[1], A[2], A[3] equal to 2, 2, 1, 0 contains all values that occur in array A.
Write a function
int ps(int[] A);
that given a zero-indexed non-empty array A consisting of N integers returns the first covering prefix of A. Assume that $N <= 1,000,000$. Assume that each element in the array is an integer in range [0..N-1].
For example, given array A such that A[0]=2 A[1]=2 A[2]=1 A[3]=0 A[4]=1
the function should return 3, as explained in the example above.
This is a very short solution. Pretty but won't scale well.
function ps($A) {
    $cp = 0; // covering prefix
    $unique = array_unique($A); // will preserve indexes
    end($unique); // go to end of the array
    $cp = key($unique); // get the key
    return $cp;
}
Here's a simple way :
function covering_prefix ( $A ) {
    $in=array();
    $li=0;
    $c=count($A);
    for($i=0 ;$i<$c ; $i++){
        if (!isset($in[$A[$i]])){
            $in[$A[$i]]='1';
            $li=$i;
        }
    }
    return $li;
}
Here is a solution using Ruby:
def first_covering_prefix(a)
  all_values = a.uniq
  i = 0
  a.each do |e|
    all_values.delete(e)
    if all_values.empty?
      return i
    end
    i = i + 1
  end
end
This is an 83% answer because it uses in_array, which scans the whole list of seen values on every iteration. A better solution was already proposed by ronan.
function solution($A) {
    // write your code in PHP5
    $in=array();
    $li=0;
    for ($i=0; $i < count($A); $i++) {
        if (!in_array($A[$i], $in)){
            $in[]=$A[$i];
            $li=$i;
        }
    }
    return $li;
}
I have a string that is coming from my database table, say $needle.
If the needle is not in my array, then I want to add it to my array.
If it IS in my array, then as long as it appears no more than twice, I still
want to add it to my array (so three times will be the maximum).
In order to check to see is if $needle is in my $haystack array, do I
need to loop through the array with strpos() or is there a quicker method ?
There are many needles in the table so I start by looping through
the select result.
This is the schematic of what I am trying to do...
$haystack = array();
while( $row = mysql_fetch_assoc($result)) {
$needle = $row['data'];
$num = no. of times $needle is in $haystack // $haystack is an array
if ($num < 3 ) {
$haystack[] = $needle; // hopefully this adds the needle
}
} // end while. Get next needle.
Does anyone know how do I do this bit:
$num = no. of times $needle is in $haystack
thanks
You can use array_count_values() to first generate a map containing the frequency for each value, and then only increment the value if the value count in the map was < 3, for instance:
$original_values_count = array_count_values($values);
foreach ($values as $value)
if ($original_values_count[$value] < 3)
$values[] = $value;
As looping cannot be completely avoided, I'd say it's a good idea to opt for using a native PHP function in terms of speed, compared to looping all values manually.
Did you mean array_count_values() to return the occurrences of all the unique values?
<?php
$a=array("Cat","Dog","Horse","Dog");
print_r(array_count_values($a));
?>
The output of the code above will be:
Array (
[Cat] => 1
[Dog] => 2
[Horse] => 1
)
There is also the array_map() function, which applies a given function to every element of an array.
Maybe something like the following? Just changing Miek's code a little.
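$haystack_count = array_count_values($haystack);
// A needle that is not in the haystack yet has no entry, so default to 0.
$count = isset($haystack_count[$needle]) ? $haystack_count[$needle] : 0;
if ($count < 3)
    $haystack[] = $needle;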