I have a large array.
In this array I have got (among many other things) a list of products:
$data['product_name_0'] = '';
$data['product_desc_0'] = '';
$data['product_name_1'] = '';
$data['product_desc_1'] = '';
This array is provided by a third party (so I have no control over this).
It is not known how many products there will be in the array.
What would be a clean way to loop through all the products?
I don't want to use a foreach loop since it will also go through all the other items in the (large) array.
I cannot use a for loop because I don't know (yet) how many products the array contains.
I can do a while loop:
$i = 0;
while(true) { // doing this feels wrong, although it WILL end at some time (if there are no other products)
if (!array_key_exists('product_name_'.$i, $data)) {
break;
}
// do stuff with the current product
$i++;
}
Is there a cleaner way of doing the above?
Doing a while(true) looks stupid to me. Or is there nothing wrong with this approach?
Or perhaps there is another approach?
Your method works, as long as the numeric portions are guaranteed to be sequential. If there are gaps, it'll miss anything that comes after the first gap.
You could use something like:
$names = preg_grep('/^product_name_\d+$/', array_keys($data));
which'll return all of the 'name' keys from your array. You'd extract the digit portion from the key name, and then can use that to refer to the 'desc' section as well.
foreach($names as $name_field) {
$id = substr($name_field, 13); // strip the 13-character 'product_name_' prefix
$name_val = $data["product_name_{$id}"];
$desc_val = $data["product_desc_{$id}"];
}
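To see how the key-matching approach copes with gaps, here is a minimal, self-contained sketch. The sample $data array and the collect_products helper are made up for illustration; only the key naming scheme comes from the question. It uses preg_match with a capture group instead of substr to pull out the numeric part.

```php
<?php
// Hypothetical sample data: note the gap (no product 1) and an unrelated key.
$data = [
    'customer'       => 'ACME',
    'product_name_0' => 'Widget',
    'product_desc_0' => 'A small widget',
    'product_name_2' => 'Gadget',
    'product_desc_2' => 'A large gadget',
];

// Illustrative helper: collect every product, keyed by its numeric suffix.
function collect_products(array $data): array
{
    $products = [];
    foreach (array_keys($data) as $key) {
        // Capture the digits that follow "product_name_".
        if (preg_match('/^product_name_(\d+)$/', (string) $key, $m)) {
            $id = $m[1];
            $products[$id] = [
                'name' => $data[$key],
                // The description may be missing, so fall back to null.
                'desc' => $data["product_desc_{$id}"] ?? null,
            ];
        }
    }
    return $products;
}

$products = collect_products($data);
```

Unlike the counting loop, this still finds product 2 even though product 1 is absent, at the cost of scanning all keys once.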
How about this
$i = 0;
while(array_key_exists('product_name_'.$i, $data)) {
// loop body
$i++;
}
I think you're close. Just put the test in the while condition.
$i = 0;
while(array_key_exists('product_name_'.$i, $data)) {
// do stuff with the current product
$i++;
}
You might also consider:
$i = 0;
while(isset($data['product_name_'.$i])) {
// do stuff with the current product
$i++;
}
isset is slightly faster than array_key_exists but behaves a little differently (it returns false when the key exists but its value is null), so it may or may not work for you:
What's quicker and better to determine if an array key exists in PHP?
Difference between isset and array_key_exists
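The practical difference shows up when a value is null. A minimal sketch, using a made-up one-element $data array:

```php
<?php
// Hypothetical data: the key exists, but its value is null.
$data = ['product_name_0' => null];

// isset() reports false because the value is null.
$viaIsset = isset($data['product_name_0']);

// array_key_exists() reports true because the key is present.
$viaKeyExists = array_key_exists('product_name_0', $data);
```

So if the third party can send null product names, the isset loop would stop early while the array_key_exists loop would not.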
Related
I am looking for a way to create new arrays in a loop. Not the values, but the array variables. So far, it looks like it's impossible or complicated, or maybe I just haven't found the right way to do it.
For example, I have a dynamic amount of values I need to append to arrays. Let's say it will be 200 000 values. I cannot assign all of these values to one array, for memory reasons on server, just skip this part.
I can assign a maximum amount of 50 000 values per one array. This means, I will need to create 4 arrays to fit all the values in different arrays. But next time, I will not know how many values I need to process.
Is there a way to generate a required amount of arrays based on fixed capacity of each array and an amount of values? Or an array must be declared manually and there is no workaround?
What I am trying to achieve is this:
$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1;$i <= $required_number_of_arrays;$i++) {
$new_array$i = array();
foreach ($data as $val) {
$new_array$i[] = $val;
}
}
// Created arrays: $new_array1, $new_array2, $new_array3
One possible way is to extend ArrayObject. You can build in a limit on how many values may be assigned, which means you need to build a class instead of writing $new_array$i = array();
However it might be better to look into generators, but Scuzzy beat me to that punchline.
The concept of generators is that with each yield, the previously yielded value is no longer accessible unless you loop over the generator again. In a way it is overwritten, unlike in arrays, where you can always go back to a previous index using $data[4].
This means you need to process the data directly. Storing the yielded data into a new array will negate its effects.
Fetching huge amounts of data is no issue with generators but one should know the concept of them before using them.
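As a rough illustration of the generator idea, the sketch below yields fixed-size batches so that only one batch is held at a time. The function names batches and numbers are invented for this example, and numbers merely stands in for a real data source such as a query result or a file.

```php
<?php
// Illustrative generator: yields batches of at most $size values from any iterable.
function batches(iterable $source, int $size): Generator
{
    $batch = [];
    foreach ($source as $value) {
        $batch[] = $value;
        if (count($batch) === $size) {
            yield $batch;
            $batch = []; // drop the old batch so its memory can be reclaimed
        }
    }
    if ($batch) {
        yield $batch; // final, possibly shorter batch
    }
}

// Stand-in for a real data source: yields 1..$n one value at a time.
function numbers(int $n): Generator
{
    for ($i = 1; $i <= $n; $i++) {
        yield $i;
    }
}

$sizes = [];
foreach (batches(numbers(12), 5) as $batch) {
    // Process each batch here; this sketch only records its size.
    $sizes[] = count($batch);
}
```

With a batch size of 50000 in place of 5, each iteration of the outer loop sees at most 50000 values, and nothing forces the full data set into memory at once.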
Based on your comments, it sounds like you don't need separate array variables. You can reuse the same one. When it gets to the max size, do your processing and reinitialize it:
$max_array_size = 50000;
$n = 1;
$new_array = [];
foreach ($data as $val) {
$new_array[] = $val;
if ($max_array_size == $n++) {
// process $new_array however you need to, then empty it
$new_array = [];
$n = 1;
}
}
if ($new_array) {
// process the remainder if the last bit is less than max size
}
You could create an array and use extract() to get variables from this array:
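$required_number_of_arrays = ceil(count($data)/50000);
$new_arrays = array();
for ($i = 1;$i <= $required_number_of_arrays;$i++) {
// each entry gets its own slice of at most 50000 values
$new_arrays["new_array$i"] = array_slice($data, ($i - 1) * 50000, 50000);
}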
extract($new_arrays);
print_r($new_array1);
print_r($new_array2);
//...
I think in your case you have to create one array that holds all your generated arrays inside it.
So first declare a variable before the loop:
$global_array = [];
Inside the loop you can generate the name and fill that array:
$global_array["new_array$i"] = $val;
After the loop you can work with that array. But I think in the end that won't fix your memory limit problem. Filling 5 arrays that together hold 200k entries should use about as much memory as filling one array with 200k entries, since the amount of data is the same. So either way you may run over the memory limit. If you can't set the limit yourself, that could be a problem.
ini_set('memory_limit', '-1');
So you can only prevent that problem by processing your values directly, without saving them in an array. For example, if you run a DB query, process the values directly and save only the result.
You can try something like this:
foreach ($data as $key => $val) {
${"new_array$i"}[] = $val; // variable variable: appends to $new_array1, $new_array2, ...
unset($data[$key]);
}
Then your value is stored in a new array and you delete the value from the original data array. After 50k values you have to create a new one.
An easier way is to use array_chunk to split your array into parts.
https://secure.php.net/manual/en/function.array-chunk.php
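A minimal sketch of the array_chunk approach, with a chunk size of 5 standing in for 50000 and range(1, 12) standing in for the real values:

```php
<?php
$data = range(1, 12);            // stand-in for the real values
$chunks = array_chunk($data, 5); // 5 stands in for 50000

// Each element of $chunks is a plain array of at most 5 values.
$chunkSizes = array_map('count', $chunks);
```

Note that all chunks exist alongside the original array at the same time, so this tidies up the code but does not by itself lower peak memory use.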
There's no need for multiple variables. If you want to process your data in chunks, so that you don't fill up memory, reuse the same variable. The previous contents of the variable will be garbage collected when you reassign it.
$chunk_size = 50000;
$number_of_chunks = ceil($data_size/$chunk_size);
for ($i = 0; $i < $data_size; $i += $chunk_size) {
    $new_array = array();
    // $i is the index of the first item of the current chunk
    for ($j = $i; $j < min($i + $chunk_size, $data_size); $j++) {
        $new_array[] = get_data_item($j);
    }
    // process $new_array here, before the next chunk overwrites it
}
$new_array[$i] serves the same purpose as your proposed $new_array$i.
You could do something like this:
$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1;$i <= $required_number_of_arrays;$i++) {
$array_name = "new_array_$i";
$$array_name = [];
foreach ($data as $val) {
${$array_name}[] = $val;
}
}
I have a piece of code running a simulation.
public function cleanUpHouses(\DeadStreet\ValueObject\House\Collection $collection)
{
$houses = $collection->getHouses();
$housesLength = count($houses);
$filterValues = false;
for($i = 0; $i < $housesLength; $i++) {
if(!$this->houseModel->hasBeenAttacked($houses[$i])) {
break;
}
$houses[$i]->setCurrentAttackers(0);
if($this->houseModel->requiresDestroying($houses[$i])) {
$houses[$i] = null;
$filterValues = true;
}
}
if($filterValues) {
$houses = array_values(array_filter($houses));
}
$collection->setHouses($houses);
return $collection;
}
However, $collection contains an array (returned by getHouses()) of up to and over 1 million results. Although the loop will never need to iterate over all of them, the line $houses = array_values(array_filter($houses)) takes ages due to the sheer size of the array (up to 3 seconds each time this line is run).
I have to keep the array index numeric, and there can be no null values in this array.
I was hoping unset($array[$i]) would shift the array elements after the element being unset 'down' in key, so if I was to unset($array[5]), then $array[6] would become $array[5], however it doesn't seem to work like this.
The break conditional is there because, on an iteration if the house under iteration hasn't been attacked, then it's safe to assume any other house after that in the array has also not been attacked.
Is there an optimal, less resource heavy way to achieve this?
I can't really restructure this at the moment as it's in unit tests and I need it finishing ASAP, the approach isn't great, but eh.
I think the most painless way to achieve this is something like this:
While you are looping through the array of houses, and you need to unset something in the array, you can cheat the loop itself.
if($this->houseModel->requiresDestroying($houses[$i])) {
// $houses[$i] = null;
// when you unset the $i house in the array,
// you can simply switch it with the last one in the array, keeping in mind,
// that this may break your logic with the break condition, so will want to change that as well.
$lastHouse = $houses[$housesLength - 1];
$houses[$i] = $lastHouse;
unset($houses[$housesLength - 1]);
$i--;
$housesLength--; // by doing the top two lines we would make the loop check the last house again.
$shouldBreak = false; // this will keep your logic with the break if later.
// $filterValues = true; // you can remove this line here.
}
You would want to set up a variable for the break condition before the for loop starts.
$shouldBreak = true;
for($i = 0; $i < $housesLength; $i++) {
...
And now for the condition itself
if(!$this->houseModel->hasBeenAttacked($houses[$i]) && true === $shouldBreak) {
break;
} else {
$shouldBreak = true; // we set $shouldBreak = false when we unset the last house,
// so we would want to keep checking the houses not to break the logic.
}
We only ever remove the last element of the array, so the keys stay numeric and contiguous.
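Stripped of the house-specific logic, the swap-with-last trick can be sketched as a standalone function. The swap_remove name and the callback parameter are invented for this illustration; the important caveat is that the original order of the elements is not preserved.

```php
<?php
// Illustrative helper: remove every element for which $shouldRemove returns true
// by overwriting it with the last element and dropping the last slot.
// Keys stay 0..n-1, but the original order is not preserved.
function swap_remove(array $items, callable $shouldRemove): array
{
    $length = count($items);
    for ($i = 0; $i < $length; $i++) {
        if ($shouldRemove($items[$i])) {
            $items[$i] = $items[$length - 1];
            unset($items[$length - 1]);
            $length--;
            $i--; // re-check the element that was just moved into slot $i
        }
    }
    return $items;
}

$result = swap_remove([1, 2, 3, 4, 5, 6], function ($n) {
    return $n % 2 === 0; // remove the even numbers
});
```

Each removal is constant-time, so no array_values(array_filter(...)) pass over the whole array is needed afterwards.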
If you have an array $p that you populated in a loop like so:
$p[] = array( "id"=>$id, "Name"=>$name);
What's the fastest way to search for John in the Name key, and if found, return the $p index? Is there a way other than looping through $p?
I have up to 5000 names to find in $p, and $p can also potentially contain 5000 rows. Currently I loop through $p looking for each name, and if found, parse it (and add it to another array), splice the row out of $p, and break 1, ready to start searching for the next of the 5000 names.
I was wondering if there is a faster way to get the index rather than looping through $p, e.g. an isset type of way?
Thanks for taking a look guys.
Okay so as I see this problem, you have unique ids, but the names may not be unique.
You could initialize the array as:
array($id=>$name);
And your searches can be like:
array_search($name,$arr);
This works well because the native needle-in-a-haystack search is likely to be better implemented than your own loop (although array_search still scans the array linearly).
e.g.
$id = 2;
$name= 'Sunny';
$arr = array($id=>$name);
echo array_search($name,$arr);
Echoes 2
The major advantage in this method would be code readability.
If you know that you are going to need to perform many of these types of search within the same request then you can create an index array from them. This will loop through the array once per index you need to create.
$piName = array();
foreach ($p as $k=>$v)
{
$piName[$v['Name']] = $k;
}
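Here is a small, self-contained sketch of how such an index could be used for many lookups. The sample rows and the list of names to search for are made up; note that if a name occurs more than once, the index keeps only the last position.

```php
<?php
// Hypothetical rows in the shape used in the question.
$p = [
    ['id' => 10, 'Name' => 'Alice'],
    ['id' => 11, 'Name' => 'John'],
    ['id' => 12, 'Name' => 'Bob'],
];

// Build the name => index map once.
$piName = [];
foreach ($p as $k => $v) {
    $piName[$v['Name']] = $k;
}

// Each lookup is now a hash lookup instead of a scan of $p.
$found = [];
foreach (['John', 'Zed', 'Alice'] as $name) {
    if (isset($piName[$name])) {
        $found[$name] = $piName[$name];
    }
}
```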
If you only need to perform one or two searches per page then consider moving the array into an external database, and creating the index there.
$index = 0;
$search_for = 'John';
$result = array_reduce($p, function($r, $v) use (&$index, $search_for) {
if($v['Name'] == $search_for) {
$r[] = $index;
}
++$index;
return $r;
}, array()); // start from an empty array so $result is never null
$result will contain all the indices of elements in $p where the element with key Name had the value John. (This of course only works for an array that is indexed numerically beginning with 0 and has no “holes” in the index.)
Edit: Possibly even easier to just use array_filter, but that will not return the indices only, but all array element where Name equals John – but indices will be preserved:
$result2 = array_filter($p, function($elem) {
return $elem["Name"] == "John" ? true : false;
});
var_dump($result2);
Which one suits your needs better, and which one is faster, is for you to figure out.
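Another option, not mentioned above, is to combine the built-in array_column and array_keys functions. This is only a sketch with made-up sample rows, and it carries the same caveat as the array_reduce version: array_column renumbers from 0, so the returned indices match $p only if $p is indexed 0, 1, 2, ... without holes.

```php
<?php
// Hypothetical rows in the shape used in the question.
$p = [
    ['id' => 1, 'Name' => 'John'],
    ['id' => 2, 'Name' => 'Jane'],
    ['id' => 3, 'Name' => 'John'],
];

// array_column() extracts the Name values; array_keys() with a search value
// returns the keys of all entries that match it (strict comparison here).
$indices = array_keys(array_column($p, 'Name'), 'John', true);
```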
https://graph.facebook.com/search?q=tom&type=user&access_token=2227470867|2.AQD2FG3bzBMEiDV3.3600.1307905200.0-100001799728875|LowLfLcqSZ9YKujFEpIrlFNVZPQ
How can I avoid repeated names in a Facebook people search? In the JSON response, there are 2 entries for Thomas Lee. Thanks.
foreach ($status_list['data'] as $data) {
echo $data['name']; // should not print the same name twice
}
$names = Array();
foreach ($status_list['data'] as $data) {
$names[] = $data['name'];
}
$names = array_unique($names); // drop duplicate names so each is printed once
foreach ($names as $name) {
echo $name;
}
Here's a quick mashup of how you can remove duplicates:
<?php
function existsInArray($list, $key, $value){
foreach($list as $lkey => $lvalue){
if($lvalue[$key] == $value){
return true;
}
}
return false;
}
$sortedUsers = array();
foreach($status_list['data'] as $data){
if(!existsInArray($sortedUsers, "id", $data["id"])){
$sortedUsers[] = $data;
}
}
This will go through the array of users and check whether an item with the same id already exists in the sorted array. If it doesn't exist, it will be added to the sorted array. Then you have $sortedUsers, which doesn't contain any duplicates.
Note: However, this is just proof-of-concept code, so there are probably a lot of performance optimizations that could be made. There is probably also some built-in functionality that can do this with less user-defined code. I showed it this way just to explain the process.
Edit: Since this answer got accepted, I feel obligated to show something of a little higher quality than proof-of-concept code, also because it was mentioned in the comments that it was inefficient.
So here's an easy fix to make this much faster:
$sortedUsers = array();
foreach($status_list['data'] as $data){
$sortedUsers[$data["id"]] = $data;
}
This way it will just overwrite the duplicates and skip the whole process of comparing each item. In the worst case this is O(n), whereas the proof-of-concept code was O(n^2) in the worst case (about n(n-1)/2 comparisons when every id is unique).
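One detail of the keyed version is that a later duplicate replaces an earlier one, and the result is keyed by id rather than 0, 1, 2, .... If that matters, a small variation (sketched here with made-up sample data in place of the real API response) keeps the first occurrence and renumbers the result:

```php
<?php
// Hypothetical response data with one duplicated id.
$status_list = ['data' => [
    ['id' => '1', 'name' => 'Thomas Lee'],
    ['id' => '2', 'name' => 'Tom Smith'],
    ['id' => '1', 'name' => 'Thomas Lee'],
]];

$usersById = [];
foreach ($status_list['data'] as $data) {
    // Only store the first user seen for each id.
    if (!isset($usersById[$data['id']])) {
        $usersById[$data['id']] = $data;
    }
}

// Renumber to a plain 0-based list.
$uniqueUsers = array_values($usersById);
```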
I have a PHP script which reads a large CSV and performs certain actions, but only if the "username" field is unique. The CSV is used in more than one script, so changing the input from the CSV to only contain unique usernames is not an option.
The very basic program flow (which I'm wondering about) goes like this:
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (in_array($username, $allUsernames)) continue;
$allUsernames[] = $username;
// process this row
}
Since this CSV could actually be quite large, it's that in_array bit which has got me thinking. The most ideal situation when searching through an array for a member is if it is already sorted, so how would you build up an array from scratch, keeping it in order? Once it is in order, would there be a more efficient way to search it than using in_array(), considering that it probably doesn't know the array is sorted?
Not keeping the array in order, but how about this kind of optimization? I'm guessing isset() for an array key should be faster than in_array() search.
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (isset($allUsernames[$username])) {
continue;
} else {
$allUsernames[$username] = true;
// do stuff
}
}
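To try the isset approach without a real file, the sketch below writes a few made-up rows to an in-memory stream and reads them back with fgetcsv. The sample usernames and the $seen and $processed variable names are illustrative only.

```php
<?php
// Build a small in-memory CSV so the sketch runs without a real file.
$fp = fopen('php://memory', 'w+');
fwrite($fp, "alice,1\nbob,2\nalice,3\ncarol,4\nbob,5\n");
rewind($fp);

$seen = [];      // username => true, used as a set
$processed = []; // rows that were actually processed

// All fgetcsv() arguments are passed explicitly (0 means no line length limit).
while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    $username = $row[0];
    if (isset($seen[$username])) {
        continue; // duplicate username: skip this row
    }
    $seen[$username] = true;
    $processed[] = $row; // stand-in for the real per-row processing
}
fclose($fp);
```

Because the usernames are array keys, each membership test is a hash lookup rather than a scan, which is where the gain over in_array comes from.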
The way to build up an array from scratch in sorted order is an insertion sort. In PHP, using a binary_search function that returns the insertion point:
$list = [];
foreach ($elems_to_insert as $element) {
    $index = binary_search($list, $element);
    array_splice($list, $index, 0, [$element]); // insert $element at position $index
}
Although it might actually turn out to be faster to just create the array in unsorted order and then sort it once (PHP's built-in sort functions use quicksort).
And to find an element in a sorted list:
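function binary_search($list, $element) {
    $start = 0;
    $end = count($list);
    // Invariant: everything before $start is < $element,
    // everything from $end onwards is >= $element.
    while ($start < $end) {
        $mid = intdiv($start + $end, 2); // integer midpoint
        if ($list[$mid] < $element) {
            $start = $mid + 1;
        } else {
            $end = $mid;
        }
    }
    return $end; // first index whose value is >= $element
}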
With this implementation you'd have to test $list[$end] (after checking that $end is a valid index) to see if it is the element you want, since if the element isn't in the array, this will return the point where it should be inserted. I did it that way so it'd be consistent with the previous code sample. If you want, you could check $list[$end] === $element in the function itself.
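Putting the two pieces together, a sorted list with duplicate detection might look like the sketch below. The lower_bound and insert_unique names are invented for this example, and the sample names are made up. Keep in mind that array_splice still has to shift elements, so each insert is linear in the worst case even though the search is logarithmic.

```php
<?php
// Illustrative lower-bound search: returns the first index whose value is
// >= $element, which is also where $element should be inserted.
function lower_bound(array $list, $element): int
{
    $start = 0;
    $end = count($list);
    while ($start < $end) {
        $mid = intdiv($start + $end, 2);
        if ($list[$mid] < $element) {
            $start = $mid + 1;
        } else {
            $end = $mid;
        }
    }
    return $start;
}

// Insert $element only if it is not already present; returns true if inserted.
function insert_unique(array &$list, $element): bool
{
    $index = lower_bound($list, $element);
    if ($index < count($list) && $list[$index] === $element) {
        return false; // already in the list
    }
    array_splice($list, $index, 0, [$element]);
    return true;
}

$list = [];
$inserted = [];
foreach (['mike', 'alice', 'zoe', 'alice', 'bob'] as $name) {
    $inserted[] = insert_unique($list, $name);
}
```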
The array type in PHP is an ordered map (PHP array type). If you pass in either ints or strings as keys, you will have an ordered map...
Please review item #6 in the above link.
in_array() does not benefit from having a sorted array. PHP just walks along the whole array as if it were a linked list.