This works. It's sort of a generic CSV file importer and key assigner. I'm looking for feedback on how this approach could be made more elegant. I started learning PHP last week. This forum is fantastic.
<?php
$csvfilename = "sdb.csv";
$filekeys = array('SSID', 'EquipName', 'EquipTypeSignalName', 'elecpropID');
$records = InputCsvFile($csvfilename, $filekeys);

function InputCsvFile($filename, $filekeys) {
    $array1 = array_map('str_getcsv', file($filename));
    foreach ($array1 as $element) {
        $int1 = 0;
        unset($array3);
        foreach ($filekeys as $key) {
            $array3[$key] = $element[$int1];
            $int1++;
        }
        $array2[] = $array3;
    }
    return $array2;
}
?>
Using array_map() is clever, but since you have to further process each row anyway, it's somewhat unnecessary. I would rewrite InputCsvFile like this:
function InputCsvFile($filename, array $columns) {
    $expectedCols = count($columns);
    $arr = [];
    // NOTE: Confirm the file actually exists before getting its contents
    $rows = file($filename);
    foreach ($rows as $row) {
        $row = str_getcsv($row);
        if (count($row) == $expectedCols) {
            $arr[] = array_combine($columns, $row);
        } else {
            // Handle the column count mismatch. The test is required because
            // otherwise, array_combine will complain loudly.
        }
    }
    return $arr;
}
Alternatively, since you're dealing with files, you could loop on fgetcsv() rather than using file() + str_getcsv(). fgetcsv() uses less memory, since the entire file doesn't have to be read in and kept in memory for the duration of the iteration; whether that matters depends on your file sizes.
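For reference, here is a minimal sketch of that fgetcsv()-based version; the function name InputCsvFileStreaming is just for illustration, and it assumes the same $columns parameter and column-count guard as above:

function InputCsvFileStreaming($filename, array $columns) {
    $expectedCols = count($columns);
    $arr = [];
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return $arr; // or log/throw: the file could not be opened
    }
    while (($row = fgetcsv($handle)) !== false) {
        if (count($row) == $expectedCols) {
            $arr[] = array_combine($columns, $row);
        }
        // else: handle the column count mismatch, as above
    }
    fclose($handle);
    return $arr;
}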
array_combine() (which incidentally, is one of my favorite functions in PHP) creates a new array given arrays of keys (your list of columns in your $filekeys array) and values (the processed rows from the csv), and is practically tailor-made for turning csv files into more usable arrays.
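For example, here's a tiny illustration with made-up values:

$keys   = array('SSID', 'EquipName');
$values = array('abc123', 'Pump 7');
print_r(array_combine($keys, $values));
// Array ( [SSID] => abc123 [EquipName] => Pump 7 )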
Related
I am looking for a way to create new arrays in a loop. Not the values, but the array variables. So far, it looks like it's impossible or complicated, or maybe I just haven't found the right way to do it.
For example, I have a dynamic number of values I need to append to arrays. Let's say it will be 200,000 values. I cannot assign all of these values to one array, for memory reasons on the server, so just skip this part.
I can assign a maximum of 50,000 values per array. This means I will need to create 4 arrays to fit all the values into different arrays. But next time, I will not know how many values I need to process.
Is there a way to generate the required number of arrays based on a fixed capacity per array and the number of values? Or must an array be declared manually, with no workaround?
What I am trying to achieve is this:
$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1; $i <= $required_number_of_arrays; $i++) {
    $new_array$i = array();
    foreach ($data as $val) {
        $new_array$i[] = $val;
    }
}
// Created arrays: $new_array1, $new_array2, $new_array3
A possible way to do this is to extend ArrayObject. You can build in a limit on how many values may be assigned, but it means building a class instead of writing $new_array$i = array();
However, it might be better to look into generators, but Scuzzy beat me to that punchline.
The concept of generators is that with each yield, the previous value is no longer accessible unless you loop over the generator again. It is, in a way, overwritten, unlike arrays, where you can always go back to a previous index using something like $data[4].
This means you need to process the data directly. Storing the yielded data into a new array would negate the benefit.
Fetching huge amounts of data is no issue with generators, but you should understand the concept before using them.
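A minimal sketch of the idea; readValues() and $source are hypothetical stand-ins for wherever the 200,000 values actually come from:

function readValues($source) {
    foreach ($source as $value) {
        yield $value; // hands out one value at a time; nothing accumulates in memory
    }
}

foreach (readValues($source) as $value) {
    // process $value directly here; collecting the yields into an array
    // would throw away the memory advantage
}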
Based on your comments, it sounds like you don't need separate array variables. You can reuse the same one. When it gets to the max size, do your processing and reinitialize it:
$max_array_size = 50000;
$n = 1;
$new_array = [];
foreach ($data as $val) {
    $new_array[] = $val;
    if ($max_array_size == $n++) {
        // process $new_array however you need to, then empty it
        $new_array = [];
        $n = 1;
    }
}
if ($new_array) {
    // process the remainder if the last bit is less than max size
}
You could create an array and use extract() to get variables from this array:
$required_number_of_arrays = ceil(count($data)/50000);
$new_arrays = array();
for ($i = 1; $i <= $required_number_of_arrays; $i++) {
    $new_arrays["new_array$i"] = $data;
}
extract($new_arrays);
print_r($new_array1);
print_r($new_array2);
//...
I think in your case you have to create an array that holds all your generated arrays inside it.
So first declare a variable before the loop:
$global_array = [];
Inside the loop you can generate the name and fill that array:
$global_array["new_array$i"] = $val;
After the loop you can work with that array. But I think in the end that won't fix your memory limit problem. Filling 5 arrays with 200k entries between them is the same as filling one array with 200k entries; the amount of data is the same, so both ways can run over the memory limit. If you can't change the limit, for example with
ini_set('memory_limit', '-1');
it could be a problem.
So you can only prevent that problem by processing your values directly, without storing them in an array first. For example, run a DB query, process each value as it arrives, and save only the result.
You can try something like this:
foreach ($data as $key => $val) {
    $new_array[] = $val;
    unset($data[$key]);
}
Each value is then stored in the new array and removed from the original $data array, so the data isn't held twice. After 50k values you would have to start a new array.
An easier way: use array_chunk() to split your array into parts.
https://secure.php.net/manual/en/function.array-chunk.php
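For example, a minimal sketch (50000 matches the chunk size from the question):

foreach (array_chunk($data, 50000) as $chunk) {
    // $chunk holds at most 50,000 values; process it here before moving on
}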
There's no need for multiple variables. If you want to process your data in chunks, so that you don't fill up memory, reuse the same variable. The previous contents of the variable will be garbage collected when you reassign it.
$chunk_size = 50000;
$number_of_chunks = ceil($data_size / $chunk_size);
for ($i = 0; $i < $data_size; $i += $chunk_size) {
    $new_array = array();
    for ($j = $i; $j < min($i + $chunk_size, $data_size); $j++) {
        $new_array[] = get_data_item($j);
    }
    // process $new_array here before the next chunk replaces it
}
$new_array[$i] serves the same purpose as your proposed $new_array$i.
You could do something like this:
$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1; $i <= $required_number_of_arrays; $i++) {
    $array_name = "new_array_$i";
    $$array_name = [];
    foreach ($data as $val) {
        ${$array_name}[] = $val;
    }
}
I'm a beginner in PHP. I have a text file like this:
Name-Id-Number
Abid-01-80
Sakib-02-76
I can read the data into an array, but I'm unable to get it as an associative array. I want to do the following things:
Take the data as an associative array in PHP.
Search Number using ID.
Find out the total of Numbers
I believe I understand what you want, and it's fairly simple. First you need to read the file into a php array. That can be done with something like this:
$filedata = file($filename, FILE_IGNORE_NEW_LINES);
Now build your desired array using a foreach() loop, explode() and standard array assignment. Your search requirement is unclear, but in this example I make each element of the resulting array (keyed by name) an associative array with keys for 'id' and 'num'.
As you create the new array, you can compute your sum, as demonstrated.
<?php
$filedata = array('Abid-01-80', 'Sakib-02-76');
$lineArray = array();
$numTotal = 0;
foreach ($filedata as $line) {
    $values = explode('-', $line);
    $numTotal += $values[2];
    $lineArray[$values[0]] = array('id' => $values[1], 'num' => $values[2]);
}
echo "Total: $numTotal\n\n";
var_dump($lineArray);
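To cover the "Search Number using ID" part of the question, here is a small sketch built on the $lineArray structure above; the helper name findNumById is just for illustration:

function findNumById(array $lineArray, $id) {
    foreach ($lineArray as $name => $info) {
        if ($info['id'] === $id) {
            return $info['num']; // e.g. findNumById($lineArray, '01') returns '80'
        }
    }
    return null; // no matching ID found
}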
Updated response:
Keep in mind that notices are not errors. They are notifying you that your code could be cleaner, but they are typically suppressed in production.
The undefined variable notices are coming because you are using:
$var += ... without having initialized $var previously. Note that you were inconsistent in this practice. For example, you initialized $numTotal, so you didn't get a notice when you incremented it the same way.
Simply add just below $numTotal = 0:
$count = 0;
$countEighty = 0;
Your other notices are most likely occurring because a blank line or string in your input doesn't follow the expected pattern. When explode() runs on such a line it doesn't return an array with 3 elements, so before you use the result of $values = explode('-', $line); make sure $line is not an empty string. You could also add a sanity check like:
if (count($values) === 3) { // It's ok to process
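Putting those pieces together, the guarded loop might look something like this (a sketch only; I'm assuming from its name that $countEighty counts entries whose number is 80 or more, and the work inside the if is whatever your original loop did):

$lineArray = array();
$numTotal = 0;
$count = 0;
$countEighty = 0;
foreach ($filedata as $line) {
    if (trim($line) === '') {
        continue; // skip blank lines so explode() doesn't produce a short array
    }
    $values = explode('-', $line);
    if (count($values) === 3) { // it's OK to process
        $numTotal += $values[2];
        $count++;
        if ($values[2] >= 80) { // assumption based on the variable name
            $countEighty++;
        }
        $lineArray[$values[0]] = array('id' => $values[1], 'num' => $values[2]);
    }
}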
Is this the simplest way there is for getting rid of duplicate strlen items from an array?
I do a lot of programming that does tasks similar to this, which is why I'm asking whether I'm doing it in too complicated a way, or whether this is the easiest way.
$usedlength = array();
$no_duplicate_filesizes_in_here = array();
foreach ($files as $file) {
    foreach ($usedlength as $length) {
        if (strlen($file) == $length) continue 2;
    }
    $usedlength[] = strlen($file);
    $no_duplicate_filesizes_in_here[] = $file;
}
$files = $no_duplicate_filesizes_in_here;
There's nothing hugely wrong with looping manually, though your example could be reduced to:
$files = array_intersect_key($files, array_unique(array_map('strlen', $files)));
PHP has a plethora of useful array functions available.
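To see how the pieces fit together, here is the same one-liner broken into steps with a small made-up input ($lengths and $firstOfEachLength are just illustrative names):

$files = array('aa.txt', 'bb.txt', 'ccc.txt');

$lengths = array_map('strlen', $files);                    // [6, 6, 7]
$firstOfEachLength = array_unique($lengths);               // [0 => 6, 2 => 7] (first key per length wins)
$files = array_intersect_key($files, $firstOfEachLength);  // [0 => 'aa.txt', 2 => 'ccc.txt']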
You can try this:
$no_duplicate_filesizes_in_here = array();
for ($i = count($files) - 1; $i >= 0; $i--) {
    $no_duplicate_filesizes_in_here[strlen($files[$i])] = $files[$i];
}
$files = array_values($no_duplicate_filesizes_in_here);
// if you don't care about the keys, don't bother with array_values()
If you're using PHP 5.3 or above, array_filter provides a nice syntax for doing this:
$lengths = array();
$nodupes = array_filter($files, function($file) use (&$lengths) {
    if (in_array(strlen($file), $lengths)) {
        return false;
    }
    $lengths[] = strlen($file);
    return true;
});
Not as short as some other answers, but another approach would be to use a key-based lookup:
$used = array();
$no_dupes = array();
foreach ($files as $file) {
    if (!array_key_exists(($length = strlen($file)), $used)) {
        $used[$length] = true;
        $no_dupes[] = $file;
    }
}
This has the added bonus of not wasting time storing duplicates only to overwrite them later. However, whether this loop is faster than some of PHP's built-in array functions depends on a number of factors (number of duplicates, length of the files array and so on) and would need to be tested. The above is what I would assume to be quicker in most cases, but I'm not a processor ;)
The above also means the first file found is the one that is kept, rather than the last found in some of the other approaches.
I have a PHP array that I'd like to duplicate but only copy elements from the array whose keys appear in another array.
Here are my arrays:
$data[123] = 'aaa';
$data[423] = 'bbb';
$data[543] = 'ccc';
$data[231] = 'ddd';
$data[642] = 'eee';
$data[643] = 'fff';
$data[712] = 'ggg';
$data[777] = 'hhh';
$keys_to_copy[] = '123';
$keys_to_copy[] = '231';
$keys_to_copy[] = '643';
$keys_to_copy[] = '712';
$keys_to_copy[] = '777';
And here is what I'd like $copied_data to end up containing:
$copied_data[123] = 'aaa';
$copied_data[231] = 'ddd';
$copied_data[643] = 'fff';
$copied_data[712] = 'ggg';
$copied_data[777] = 'hhh';
I could just loop through the data array like this:
foreach ($data as $key => $value) {
    if (in_array($key, $keys_to_copy)) {
        $copied_data[$key] = $value;
    }
}
But this will be happening inside a loop which is retrieving data from a MySQL result set. So it would be a loop nested within a MySQL data loop.
I normally try and avoid nested loops unless there's no way of using PHP's built-in array functions to get the result I'm looking for.
But I'm also wary of having a nested loop within a MySQL data loop; I don't want to keep MySQL hanging around.
I'm probably worrying about nested loop performance unnecessarily as I'll never be doing this for more than a couple of hundred rows of data and maybe 10 keys.
But I'd like to know if there's a way of doing this with built-in PHP functions.
I had a look at array_intersect_key(), but that doesn't quite do it, because my $keys_to_copy array has my desired keys as array values rather than keys.
Anyone got any ideas?
Cheers, B
I worked it out - I almost had it above. I thought I'd post the answer anyway for completeness. Hope this helps someone out!
array_intersect_key($data, array_flip($keys_to_copy))
Use array_flip() to flip $keys_to_copy so that its values become keys, which lets it be used with array_intersect_key().
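To make the intermediate step concrete, using the example data from the question ($flipped is just an illustrative name):

$flipped = array_flip($keys_to_copy);
// [123 => 0, 231 => 1, 643 => 2, 712 => 3, 777 => 4]
// array_intersect_key() only compares keys, so the 0..4 values don't matter
$copied_data = array_intersect_key($data, $flipped);
// [123 => 'aaa', 231 => 'ddd', 643 => 'fff', 712 => 'ggg', 777 => 'hhh']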
I'll run some tests to compare performance between the manual loop above and this answer. I would expect the built-in functions to be faster, but they might be pretty equal. I know arrays are heavily optimised, so I'm sure it will be close.
EDIT:
I have run some benchmarks using PHP CLI to compare the foreach() code in my question with the code in my answer above. The results are quite astounding.
Here's the code I used to benchmark, which I think is valid:
<?php
ini_set('max_execution_time', 0); // NOT NEEDED FOR CLI

// BUILD RANDOM DATA ARRAY
$data = array();
while (count($data) <= 200000) {
    $data[rand(0, 500000)] = rand(0, 500000);
}
$keys_to_copy = array_rand($data, 100000);

// FOREACH
$timer_start = microtime(TRUE);
foreach ($data as $key => $value) {
    if (in_array($key, $keys_to_copy)) {
        $copied_data[$key] = $value;
    }
}
echo 'foreach: '.(microtime(TRUE) - $timer_start)."s\r\n";

// BUILT-IN ARRAY FUNCTIONS
$timer_start = microtime(TRUE);
$copied_data = array_intersect_key($data, array_flip($keys_to_copy));
echo 'built-in: '.(microtime(TRUE) - $timer_start)."s\r\n";
?>
And the results...
foreach: 662.217s
array_intersect_key: 0.099s
So it's much faster over loads of array elements to use the PHP array functions rather than foreach. I thought it would be faster but not by that much!
Why not load the entire result set into an array, then begin processing with nested loops?
$query_result = mysql_query($my_query) or die(mysql_error());
$query_rows = mysql_num_rows($query_result);
for ($i = 0; $i < $query_rows; $i++)
{
    $row = mysql_fetch_assoc($query_result);
    // 'key' is the name of the column containing the data key (123)
    // 'value' is the name of the column containing the value (aaa)
    $data[$row['key']] = $row['value'];
}
foreach ($data as $key => $value)
{
    if (in_array($key, $keys_to_copy))
    {
        $copied_data[$key] = $value;
    }
}
I have a PHP script which reads a large CSV and performs certain actions, but only if the "username" field is unique. The CSV is used in more than one script, so changing the input from the CSV to only contain unique usernames is not an option.
The very basic program flow (which I'm wondering about) goes like this:
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (in_array($username, $allUsernames)) continue;
$allUsernames[] = $username;
// process this row
}
Since this CSV could actually be quite large, it's that in_array bit which has got me thinking. The most ideal situation when searching through an array for a member is if it is already sorted, so how would you build up an array from scratch, keeping it in order? Once it is in order, would there be a more efficient way to search it than using in_array(), considering that it probably doesn't know the array is sorted?
This doesn't keep the array in order, but how about this kind of optimization? I'm guessing isset() on an array key should be faster than an in_array() search.
$allUsernames = array();
while ($row = fgetcsv($fp)) {
    $username = $row[0];
    if (isset($allUsernames[$username])) {
        continue;
    } else {
        $allUsernames[$username] = true;
        // do stuff
    }
}
The way to build up an array from scratch in sorted order is an insertion sort. In PHP-ish pseudocode:
$list = [];
foreach ($elems_to_insert as $element) {
    $index = binary_search($list, $element);
    insert_into_list($element, $list, $index);
}
Although, it might actually turn out to be faster to just create the array in unsorted order and then sort it afterwards (PHP's built-in sort functions use quicksort).
And to find an element in a sorted list:
function binary_search($list, $element) {
    // $start begins one below the first index so that index 0 can also be returned
    $start = -1;
    $end = count($list);
    while ($end - $start > 1) {
        $mid = (int) (($start + $end) / 2);
        if ($list[$mid] < $element) {
            $start = $mid;
        } else {
            $end = $mid;
        }
    }
    return $end;
}
With this implementation you'd have to test $list[$end] to see if it is the element you want, since if the element isn't in the array, this will find the point where it should be inserted. I did it that way so it'd be consistent with the previous code sample. If you want, you could check $list[$end] === $element in the function itself.
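As a rough sketch of how this could plug into the original loop (assuming $fp is the open file handle from the question; array_splice() is used here as one way to implement the insert_into_list() placeholder):

$allUsernames = array();
while ($row = fgetcsv($fp)) {
    $username = $row[0];
    $index = binary_search($allUsernames, $username);
    if (isset($allUsernames[$index]) && $allUsernames[$index] === $username) {
        continue; // already seen this username
    }
    array_splice($allUsernames, $index, 0, array($username)); // insert, keeping the array sorted
    // process this row
}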
The array type in PHP is an ordered map (see the PHP manual page on the array type). If you pass in either ints or strings as keys, you will have an ordered map.
Please review item #6 on that manual page.
in_array() does not benefit from having a sorted array. PHP just walks along the whole array as if it were a linked list.