In a comma-separated CSV file, I want to count the rows that share the same first number.
Below is example data. I want the number of rows that start with 2 (2 rows) and the number that start with 4 (3 rows). This is just an example; the numbers are random:
2,0,0
2,1,0
4,0,0
4,3,0
4,4,0
I'm trying the following code. I can count all rows of the file, but I don't know how to count only the rows that share the same first number.
$i = 0;
while ($i < 5) { //fixed number of times
$i++;
$rows = 0;
$fp = fopen("test.csv", "r");
while (fgetcsv($fp)) { //don't want all rows
$rows++;
}
fclose($fp);
echo $rows;
}
Edit:
Sorry, I forgot to mention that the numbers in the file above are random; they are not always 2 or 4.
You will need an array to get statistics.
$i = 0;
$agg = []; // to count stats
while ($i < 5) {
$i++;
$rows = 0;
$fp = fopen("test.csv", "r");
while ($line = fgetcsv($fp)) {
$number = $line[0]; // get first number
if(isset($agg[$number])){ // if there is the number in stats
$agg[$number]++; // count new one
} else {
$agg[$number] = 1; // mark as found one
}
}
fclose($fp);
print_r($agg);
}
You could probably try something like this :
$i = 0;
while ($i < 5) { //fixed number of times
$i++;
$rows = array();
$fp = fopen("test.csv", "r");
while ($data = fgetcsv($fp)) {
$first = $data[0];
if (isset($rows[$first])) {
$rows[$first] += 1;
} else {
$rows[$first] = 1;
}
}
fclose($fp);
print_r($rows);
}
I didn't test my code, so it may contain errors.
After execution, $rows will contain an associative array of the form 'first number' => 'count'.
What I don't understand, however, is why you are doing it five times, and why you use a while loop for that instead of a for loop.
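For reference, here is a minimal single-pass sketch of the same counting idea, without the outer loop. It assumes the sample data from the question and writes it to a temporary file so that the snippet runs on its own; a real script would open test.csv instead. The fgetcsv() arguments are spelled out only to keep newer PHP versions quiet about default values.

```php
<?php
// Sample data from the question, written to a temporary file
// (assumption: the real script would read test.csv instead).
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "2,0,0\n2,1,0\n4,0,0\n4,3,0\n4,4,0\n");

$counts = [];
$fp = fopen($path, 'r');
while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    if ($row === [null]) {
        continue; // fgetcsv() returns [null] for a blank line
    }
    $first = $row[0];
    // Start at 0 the first time a value is seen, then add one.
    $counts[$first] = ($counts[$first] ?? 0) + 1;
}
fclose($fp);
unlink($path);

foreach ($counts as $number => $count) {
    echo "$number: $count rows\n";
}
```

Because the keys are whatever appears in the first column, this works for any numbers, not only 2 and 4.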
Assuming you know how to get the contents of the text file, something like this should work:
$ar = explode("\n",$txtfileContents);
$counts = array();
foreach($ar as $row){
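if ($row === '') continue; // skip blank lines, e.g. after a trailing newline
// initialise missing keys so the increment never touches an undefined index
$counts[$row[0]] = ($counts[$row[0]] ?? 0) + 1;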
}
$counts now contains an array of all first characters and how many times they appeared. You can now simply access them like this:
echo "'2' appeared ".$counts['2']." times<br/>";
echo "'4' appeared ".$counts['4']." times";
Step by step:
Split the text file into an array of individual rows:
$ar = explode("\n",$txtfileContents);
(Alternatively, try file() here, which reads the file directly into an array of lines and may be more reliable than explode().)
Loop through it:
foreach($ar as $row){
}
Then check the first character inside the loop and count each character separately:
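$counts[$row[0]] = ($counts[$row[0]] ?? 0) + 1; // ?? 0 avoids an undefined-index notice the first time a character is seen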
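Note that $row[0] is only the first character of the line, so a first number such as 12 would be counted under '1'. A possible variant, sketched here under the assumption that the lines are already in an array (as file('test.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) would return them), takes everything before the first comma instead. The sample lines, including the 12, are made up for illustration, and the arrow functions need PHP 7.4 or later.

```php
<?php
// Hypothetical sample lines; in practice file('test.csv', ...) would supply them.
$lines = ["2,0,0", "2,1,0", "12,0,0", "4,3,0", "4,4,0"];

// Take everything before the first comma as the key for each line.
$firstFields = array_map(fn($line) => explode(',', $line, 2)[0], $lines);

// array_count_values() tallies how often each key occurs.
$counts = array_count_values($firstFields);

print_r($counts);
```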
Related
Hi, I have a watch_points column that holds numbers in the range 1, 2, 3, 4, 5 ... 100.
Example: 1,2,4,5,34,56,100
In the example above, 3 is the first missing number, so 3 should be returned.
$watchPoints = $videoWatchedData['watch_points'];
$fetArray = explode(",",$watchPoints); //unsorted 2,4,5,100,56,1,34
I want to sort the list above into 1,2,4,5,34,56,100 and return the first missing number.
What I have tried:
$sortFetchedArraysort = sort($fetArray ); //ksort,rosrt no one is working
$Expected = 1;
foreach ($sortFetchedArraysort as $Number){
if ($Expected != $Number) {
break;
}
$Expected++;
}
$percentageCount = $Number; // first missing number in my case output should return 3
exit;
I am facing two problems: the sort is not working, and the first missing number is not being returned.
Try this short snippet:
<?php
$array = explode(',', "10,1,2,4,5,6,25,36,75,100");
sort($array); // sort() takes its argument by reference, so it needs a variable
print_r(current(array_diff(range(1, 100), $array)));
I hope this simple approach helps. Your post sorts $fetArray, but sorting is not needed; you can check for the missing number like this:
<?php
ini_set('display_errors', 1);
$array=range(1,100);//your columns
//sorting works like this, but it is not required for the check below
$fetArray=array(2,4,5,100,56,1,34);
sort($fetArray);
//loop over the full range of expected values
foreach($array as $value)
{
//as soon as a value is not present in $fetArray, break out of the loop
if(!in_array($value, $fetArray))
{
break;
}
}
//after the break, $value holds the first value that is not present
echo $value;
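$watchPoints = "10,1,2,4,5,6,25,36,75,100";
$fetArray = explode(",", $watchPoints);
sort($fetArray);
$missing = count($fetArray) + 1; // default when 1..n are all present
for ($i = 0; $i < count($fetArray); $i++) {
if ($fetArray[$i] != $i + 1) { // position $i should hold the value $i + 1
$missing = $i + 1;
break;
}
}
print($missing);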
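The approaches above can also be sketched without sorting or in_array(), by turning the values into array keys so that each membership test is a key lookup. The function name firstMissing and its $max parameter are hypothetical, chosen only for this illustration.

```php
<?php
// Hypothetical helper: returns the smallest integer in [1, $max] that is
// absent from the comma-separated list, or null if none is missing.
function firstMissing(string $watchPoints, int $max = 100): ?int
{
    // Flip the values into keys so each check below is a hash lookup.
    $seen = array_flip(array_map('intval', explode(',', $watchPoints)));
    for ($n = 1; $n <= $max; $n++) {
        if (!isset($seen[$n])) {
            return $n;
        }
    }
    return null;
}

echo firstMissing("2,4,5,100,56,1,34"), "\n"; // prints 3
```

The input order does not matter here, so the unsorted string from the question can be passed as it is.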
Example:
thisisline>and>1
thisisanother>line>something>13
just>another>line>143
short>11
I have this kind of data in the 'profile' column of my database. I need to select the user_name and profile columns for the rows where profile has the 5 highest values after the last occurrence of '>'.
First you have to extract the numbers from your lines. Afterwards, order them and take the ones you want.
$lines = "thisisline>and>1
thisisanother>line>something>13
just>another>line>143
short>11";
$tmp = explode("\n", $lines); //separate lines
$numbers = array();
foreach($tmp as $line)
{
$numbers[] = substr($line, strrpos($line, ">") + 1); //extract number
}
rsort($numbers); //order your results
$count = 2; //define the count of results
$result = array_slice($numbers, 0, $count); //just take the ones which you want
var_dump($result);
You'll get your result as an array.
Do it like this.
Read all lines into an array, parse them, store the last value of each line in an array, sort it in reverse order, and take what you need.
$lines = 'thisisline>and>1
thisisanother>line>something>13
just>another>line>143
short>11';
define('NUM_TO_GET', 2); //Get the two highest. Rewrite it to 5!
$array = explode("\n", $lines);
$results = array();
foreach ($array as $line) {
$tmp = explode('>', $line);
$results [] = $tmp[count($tmp) - 1];
}
rsort($results);
for ($i = 0; $i < NUM_TO_GET; $i++) {
if (isset($results[$i])) {
echo $results[$i] . "<br />";
} else {
break;
}
}
$theNumber = substr($string, strrpos($string, ">")+1);
This should give you the number for each line. Then, you just compare the numbers.
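If the whole profile string is needed in the result rather than only the number, one option is to sort the rows themselves by the extracted number. This is a rough sketch that assumes the profile values have already been fetched into a PHP array; the variable names are illustrative, and the arrow functions need PHP 7.4 or later.

```php
<?php
// Sample rows from the question; the real values would come from the database.
$profiles = [
    'thisisline>and>1',
    'thisisanother>line>something>13',
    'just>another>line>143',
    'short>11',
];

// Extract the integer after the last '>' of a profile string.
$trailing = fn(string $p): int => (int) substr($p, strrpos($p, '>') + 1);

// Sort whole rows by that number, highest first, then keep the top N.
usort($profiles, fn($a, $b) => $trailing($b) <=> $trailing($a));
$top = array_slice($profiles, 0, 2); // use 5 for the five highest

print_r($top);
```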
I'm importing a CSV that has 3 columns, one of these columns could have duplicate records.
I have 2 things to check:
1. The field 'NAME' is not null and is a string
2. The field 'ID' is unique
So far, I parse the CSV file once and check condition 1 (NAME is valid). If that check fails, the code simply breaks out of the while loop and stops.
My question is: how do I check that ID is unique?
I have fields like the following:
NAME, ID,
Bob, 1,
Tom, 2,
James, 1,
Terry, 3,
Joe, 4,
This would output something like 'Duplicate ID on line 3'.
Thanks
P.S. This CSV file has more columns and can have around 100,000 records. I have simplified it here to focus on the duplicate-ID check.
Thanks
<?php
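$cnt = 0;
$arr=array();
$arrdup=array(); // collected duplicate messages; initialised so print_r() works when there are none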
if (($handle = fopen("1.csv", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num=count($data);
$cnt++;
for ($c=0; $c < $num; $c++) {
if(is_numeric($data[$c])){
if (array_key_exists($data[$c], $arr))
$arrdup[] = "duplicate value at ".($cnt-1);
else
$arr[$data[$c]] = $data[$c-1];
}
}
}
fclose($handle);
}
print_r($arrdup);
Give it a try:
$row = 1;
$totalIDs = array();
if (($handle = fopen('/tmp/test1.csv', "r")) !== FALSE)
{
while (($data = fgetcsv($handle)) !== FALSE)
{
$name = '';
if (isset($data[0]) && $data[0] != '')
{
$name = $data[0];
if (is_numeric($data[0]) || !is_string($data[0]))
echo "Name is not a string for row $row\n";
}
else
{
echo "Name not set for row $row\n";
}
$id = '';
if (isset($data[1]))
{
$id = $data[1];
}
else
{
echo "ID not set for row $row\n";
}
if (isset($totalIDs[$id])) {
echo "Duplicate ID on line $row\n";
}
else {
$totalIDs[$id] = 1;
}
$row++;
}
fclose($handle);
}
I assumed a certain kind of design and stripped out the CSV part, but the idea remains the same:
<?php
/* Let's make an array of 100,000 rows (Be careful, you might run into memory issues with this, issues you won't have with a CSV read line by line)*/
$arr = [];
for ($i = 0; $i < 100000; $i++)
$arr[] = [rand(0, 1000000), 'Hey'];
/* Now let's have fun */
$ids = [];
foreach ($arr as $line => $couple) {
if (isset($ids[$couple[0]])) // isset() avoids an undefined-index notice for new IDs
echo "Id " . $couple[0] . " on line " . $line . " already used<br />";
else
$ids[$couple[0]] = true;
}
?>
100,000 rows isn't that many, so this will be enough. (It ran in 3 seconds on my machine.)
EDIT: As pointed out, in_array is less efficient than key lookup. I've updated my code consequently.
Are the IDs sorted with possible duplicates in between or are they randomly distributed?
If they are sorted and there are no holes in the list (1,2,3,4 is OK; 1,3,4,7 is NOT OK) then just store the last ID you read and compare it with the current ID. If current is equal or less than last then it's a duplicate.
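The sorted case could look roughly like the sketch below. The $ids array is a hypothetical stand-in for the ID column, read in file order; only the previous ID is kept, so memory use stays constant no matter how long the file is.

```php
<?php
// Hypothetical, already-sorted ID column.
$ids = [1, 2, 2, 3, 4, 4];

$duplicates = [];
$last = null;
foreach ($ids as $index => $id) {
    // In sorted input, an ID that does not increase repeats an earlier one.
    if ($last !== null && $id <= $last) {
        $duplicates[] = $index + 1; // 1-based position of the repeated ID
    }
    $last = $id;
}

print_r($duplicates);
```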
If the IDs are in random order then you'll have to store them in an array. You have multiple options here. If you have plenty of memory just store the ID as a key in a plain PHP array and check it:
$ids = array();
// ... read and parse CSV
if (isset($ids[$newId])) {
// you have a duplicate
} else {
$ids[$newId] = true; // new value, not a duplicate
}
PHP arrays are hash tables and have a very fast key lookup. Storing IDs as values and searching with in_array() will hurt performance a lot as the array grows.
If you have to save memory and you know the number of lines you are going to read from the CSV, you could use SplFixedArray instead of a plain PHP array. The duplicate check would be the same as above.
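Applied to the sample data from the question, the key-lookup check might be sketched as follows. It assumes the lines are already in an array and that no field contains a quoted comma, so a plain explode() is enough; storing the line number as the array value (instead of true) is an addition made here so the message can say where the ID first appeared.

```php
<?php
// Sample rows from the question, header included.
$lines = ["NAME, ID,", "Bob, 1,", "Tom, 2,", "James, 1,", "Terry, 3,", "Joe, 4,"];

$seen = [];      // ID => data line on which it first appeared
$messages = [];
foreach ($lines as $i => $line) {
    if ($i === 0) {
        continue; // skip the header row; data lines are then numbered from 1
    }
    $fields = array_map('trim', explode(',', $line));
    $id = $fields[1] ?? '';
    if (isset($seen[$id])) {
        $messages[] = "Duplicate ID on line $i (first seen on line {$seen[$id]})";
    } else {
        $seen[$id] = $i;
    }
}

echo implode("\n", $messages), "\n";
```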
I am trying to parse a CSV file into an array. Unfortunately, one of the columns contains commas and quotes (example below). Any suggestions on how I can avoid breaking that column up into multiple columns?
I have tried changing the delimiter in the fgetcsv function, but that didn't work, so I tried using str_replace to escape all the commas, but that broke the script.
Example of CSV format
title,->link,->description,->id
Achillea,->http://www.example.com,->another,short example "Of the product",->346346
Seeds,->http://www.example.com,->"please see description for more info, thanks",->34643
Ageratum,->http://www.example.com,->this is, a brief description, of the product.,->213421
// Open the CSV
if (($handle = fopen($fileUrl, "r")) !==FALSE) {
// Set the parent array key to 0
$key = 0;
// While there is data available loop through unlimited times (0) using separator (,)
while (($data = fgetcsv($handle, 0, ",")) !==FALSE) {
// Count the total keys in each row
$c = count($data);
//Populate the array
for ($x = 0; $x < $c; $x++) {
$arrCSV[$key][$x] = $data[$x];
}
$key++;
} // end while
// Close the CSV file
fclose($handle);
}
Maybe you should think about using PHP's file() function, which reads your CSV file into an array.
Depending on your delimiter, you could then use explode() to split the lines into cells.
Here is an example:
$csv_file = file("test_file.csv", FILE_IGNORE_NEW_LINES); // one array entry per line
foreach($csv_file as $line){
$cell = explode(",->", $line); // ==> if ",->" is your csv-delimiter!
$title[] = $cell[0];
$link[] = $cell[1];
$description[] = $cell[2];
$id[] = $cell[3];
}
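To see why splitting on ",->" keeps the description intact, here is a small self-contained sketch using two of the sample lines from the question. It assumes that ",->" really is the field separator and that it never occurs inside a field. Keying each row by the header names is an extra step added here for readability.

```php
<?php
// Sample lines from the question; file() would normally supply these.
$lines = [
    'title,->link,->description,->id',
    'Achillea,->http://www.example.com,->another,short example "Of the product",->346346',
    'Seeds,->http://www.example.com,->"please see description for more info, thanks",->34643',
];

$header = explode(',->', array_shift($lines));
$rows = [];
foreach ($lines as $line) {
    // Commas that are not followed by "->" stay inside their cell.
    $cells = explode(',->', $line);
    if (count($cells) !== count($header)) {
        continue; // skip malformed lines rather than guess
    }
    $rows[] = array_combine($header, $cells);
}

echo $rows[0]['description'], "\n";
```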
I found this code in the question "PHP take all combinations".
I modified it further to generate sets of n elements from the given array. However, I cannot figure out how to avoid duplicate numbers within a set. For example:
If the output is
1 , 1, 3, 4
then it should remove the extra '1' and give it as
1,3,4
similarly if there are 2 outputs.
1,3,4,5 and 4,5,3,1
then it should remove one of the duplicate set as well.
I tried array_unique, thinking it could solve half of the issue, but it gave a memory allocation error.
<?php
function permutations($arr,$n)
{
$res = array();
foreach ($arr as $w)
{
if ($n==1) $res[] = $w;
else
{
$perms = permutations($arr,$n-1);
foreach ($perms as $p)
{
$res[] = $w." ".$p;
}
}
}
return $res;
}
// Your array
$numbers = array(1,2,3,4,5,6,7);
// Get permutation by groups of n elements
for($i=1; $i<8; $i++)
$pe = permutations($numbers,$i);
$pe = array_unique($pe);
// Print it out
print_r($pe);
?>
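One way to avoid both kinds of duplicates is to generate combinations rather than permutations: each element is only combined with the elements that come after it, so a number cannot repeat within a set and the same set cannot appear in a different order. The sketch below is only an illustration of that idea, not the code from the linked question; the function name combinations is hypothetical, and it assumes the input is a list with sequential keys.

```php
<?php
// Hypothetical helper: all $n-element combinations of $items, each element
// used at most once and each set produced in one order only.
function combinations(array $items, int $n): array
{
    if ($n === 0) {
        return [[]]; // exactly one way to choose nothing
    }
    $result = [];
    $count = count($items);
    for ($i = 0; $i <= $count - $n; $i++) {
        // Only combine with elements after position $i, so there are
        // no repeated numbers and no reordered copies of the same set.
        foreach (combinations(array_slice($items, $i + 1), $n - 1) as $tail) {
            $result[] = array_merge([$items[$i]], $tail);
        }
    }
    return $result;
}

foreach (combinations([1, 2, 3, 4], 3) as $set) {
    echo implode(',', $set), "\n";
}
```

For 7 numbers this yields far fewer results than the permutation approach (35 sets of size 3, for example), which should also ease the memory pressure.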