php Duplicate content


I have a text file that contains phone numbers.
I want to filter out the duplicate numbers. How could I do it using PHP?
Each number is on a new line (\r\n).

You could:
Parse the string into an array, via explode
Filter out the dups, via array_unique

$numbers = file('mydata.txt');      // read the file into an array, one element per line
$numbers = array_unique($numbers);  // drop repeated lines, keeping the first occurrence

This should do:
$numbers = array_unique(file('phones.txt'));
print_r($numbers);
Used functions file() and array_unique().
Good luck!
Further explanation.
According to the documentation, file():
Returns the file in an array. Each element of the array corresponds to a line in the file...
So you can take advantage of the fact that there is one phone number per line.
Note:
To clarify, this won't work if the .txt file literally contains the characters /r/n as text instead of real line breaks, for example:
123/r/n
456/r/n
123/r/n
789/r/n
More:
You may also find file_get_contents() useful, but note that it returns the whole file as a single string, NOT an array.

file() reads the file into an array
array_unique() removes the duplicates
implode() recreates the one-per-line format
file_put_contents() writes the file back
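The four steps above can be sketched as one small function. This is only an illustration: the helper name dedupe_lines, the temporary demo file, and the choice to write "\r\n" line endings back are assumptions, not part of the original answers. It also trims each line first, so a stray "\r" or a missing final newline does not make two equal numbers look different.

```php
<?php
// Hypothetical helper: remove duplicate lines from a text file in place.
function dedupe_lines(string $path): array
{
    // FILE_IGNORE_NEW_LINES drops the line ending from each element.
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // trim() removes leftover "\r" or spaces; 'strlen' filters out empty lines.
    $lines = array_filter(array_map('trim', $lines), 'strlen');

    // array_unique() keeps the first occurrence; array_values() reindexes.
    $unique = array_values(array_unique($lines));

    // Assumption: the file should keep Windows-style line endings.
    file_put_contents($path, implode("\r\n", $unique) . "\r\n");

    return $unique;
}

// Demo on a throwaway file.
$path = tempnam(sys_get_temp_dir(), 'phones');
file_put_contents($path, "123\r\n456\r\n123\r\n789\r\n");
$result = dedupe_lines($path);
print_r($result);
```

If the file should use plain "\n" line endings instead, only the implode() separator needs to change.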

Related

Is there an equivalent function in PHP for unix cut -f?

I have a long string of values separated by tabs and I wish to cut the data as if using unix cut -f. If I use cut -f5 it cuts all my data into a single column of the value which is in the 5th position. Is there a PHP function that can do the same?
Below is an example of the raw file with each word in the row separated by a tab
The result would be as follows if I ran cut -f2:

I guess the answer really is "no", as far as I know. But you can combine a few PHP functions to achieve the same result.
You can use file to read the lines from the file into an array
$rows = file($path_to_your_file);
Then convert that array of strings to a multidimensional array
$rows = array_map(function($row){
return str_getcsv($row, "\t");
}, $rows);
Then get the column you want from that array.
$column_5 = array_column($rows, 4);
Not as concise as cut -f5, but PHP rarely is for things like this.
Incidentally, if you don't care that your PHP program will only work on systems that have unix cut, you can actually just use the cut -f in shell_exec.
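For reference, the same idea can be written with plain explode() instead of str_getcsv(). This is a rough sketch only: cut_field is a made-up helper name, the sample lines are invented, and returning an empty string for lines with too few fields is an assumption. It does not try to reproduce every behaviour of the real cut.

```php
<?php
// Hypothetical helper loosely mimicking `cut -fN` (1-based field number).
function cut_field(array $lines, int $field, string $delimiter = "\t"): array
{
    $column = [];
    foreach ($lines as $line) {
        // Strip the trailing line ending that file() leaves attached.
        $parts = explode($delimiter, rtrim($line, "\r\n"));
        // Assumption: a line with fewer fields yields an empty string.
        $column[] = $parts[$field - 1] ?? '';
    }
    return $column;
}

// Invented sample data; in practice this would come from file($path).
$lines = ["a\tb\tc\n", "d\te\tf\n", "g\n"];
$second = cut_field($lines, 2);
print_r($second);
```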

php array_search not returning values

I am putting the contents of a text file into an array via the file() function. When I try to search the array for a specific value, it does not seem to return anything, but when I look at the contents of the array, the value I am searching for is there.
Code used for putting text into array:
$usernameFileHandle = fopen("passStuff/usernames.txt", "r+");
$usernameFileContent = file("passStuff/usernames.txt");
fclose($usernameFileHandle);
Code for searching the array
$inFileUsernameKey = array_search($username, $usernameFileContent);
Usernames.txt contains
Noah
Bob
Admin
And so does the $usernameFileContent array. Why is array_search not working, and is there a better way to do this? Please excuse my PHP noob-ness, thanks in advance.

Because file():
Returns the file in an array. Each element of the array corresponds to a line in the file, with the newline still attached
To prove this try the following:
var_dump(array_search('Bob
', $usernameFileContent));
You could use array_map() and trim() to correct the behavior of file(). Or, alternatively, use file_get_contents() and explode().

To quote the docs:
Each element of the array corresponds to a line in the file, with the newline still attached.
That means that when you're doing the search, you're searching for "Noah" in an array that contains "Noah\n" - which doesn't match.
To fix this, you should run trim() on each element of your array before you do the search.
You can do that using array_map() like this:
$usernameFileContent = array_map('trim', $usernameFileContent);
Note, too, that the file() function operates directly on the provided filename and does not need a file handle. That means you do not need to use fopen() or fclose(); you can remove those two lines entirely.
So your final code could look like this:
$usernameFileContent = array_map('trim', file('passStuff/usernames.txt'));
$inFileUsernameKey = array_search($username, $usernameFileContent);
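Another option, not mentioned in the answers above, is to ask file() itself to drop the line endings with the FILE_IGNORE_NEW_LINES flag. The sketch below uses a temporary file with the usernames from the question so that it can run on its own; the variable names are illustrative.

```php
<?php
// Build a throwaway file with the usernames from the question.
$path = tempnam(sys_get_temp_dir(), 'users');
file_put_contents($path, "Noah\nBob\nAdmin\n");

// Without the flag, each element still ends in "\n", so the search fails.
$withNewlines = file($path);
$missing = array_search('Bob', $withNewlines, true);
var_dump($missing); // bool(false)

// With the flag, the elements are the bare usernames.
$usernames = file($path, FILE_IGNORE_NEW_LINES);
$key = array_search('Bob', $usernames, true);
var_dump($key); // int(1)
```

Remember to compare the result with === false, because a match at index 0 is also falsy.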

PHP - get_file_contents not working as an array?

I'm using a $file_contents = file_get_contents($file_name) then using $file_contents = array_splice($file_contents, 30, 7, 'changedText') to update something in the file code. However, this keeps resulting in:
Warning: array_splice(): The first argument should be an array
From what I understand the string returned by file_get_contents() should be able to be acted on like any other array. Any reason I'm having trouble with this? Thank you much!

From the manual:
file_get_contents — Reads entire file into a string
So you don't have an array. You have a string.

Read the documentation.
A string is not an array, even though it supports square-bracket access:
$str[0]
Use the str_split function to get the behavior you want. It converts your string into a real array, which you can then pass to array functions such as array_splice or, as here, array_slice. E.g.:
echo('<pre>');
var_dump(array_slice(str_split("Stack Overflow"), 6));
echo('</pre>');
die();
I think it helps.
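For the replacement the question describes, the string function substr_replace() may be a closer fit than converting to an array, since it replaces a range of characters directly. This is a different technique from the str_split approach above. The sample text and offsets below are invented for illustration; the question's values would be offset 30 and length 7. The second half shows the str_split route for comparison. Note that array_splice() modifies its argument in place and returns the removed elements, so its return value should not be assigned back to the contents.

```php
<?php
$contents = 'The quick brown fox jumps over the lazy dog';

// String approach: replace 5 characters starting at offset 10.
// Argument order: string, replacement, offset, length.
$updated = substr_replace($contents, 'red', 10, 5);
echo $updated, "\n";

// Array approach: split into characters, splice, then join again.
$chars = str_split($contents);
array_splice($chars, 10, 5, str_split('red')); // works on $chars by reference
$viaArray = implode('', $chars);
echo $viaArray, "\n";
```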

How To Get The Unique Name Count With PHP?

Let's say I have text file Data.txt with:
26||jim||1990
31||Tanya||1942
19||Bruce||1612
8||Jim||1994
12||Brian||1988
56||Susan||2201
and it keeps going.
It has many different names in column 2.
Please tell me, how do I get the count of unique names, and how many times each name appears in the file using PHP?
I have tried:
$counts = array_count_values($item[1]);
echo $counts;
after exploding ||, but it does not work.
The result should be like:
jim-2,
tanya-1,
and so on.
Thanks for any help...

Read in each line, explode using the delimiter (in this case ||), and add it to an array if it does not already exist. If it does, increment the count.
I won't write the code for you, but here are a few pointers:
fgets reads in a line
explode will split the line based on a delimiter
use in_array to check if the name has been found before, and to determine whether you need to add the name to the array or just increment the count.
Edit:
Following Jon's advice, you can make it even easier for you.
Read in line-by-line, explode by delimiter and dump all the names into an array (don't worry about checking if it already exists). After you're done, use array_count_values to get every unique name and its frequency.

Here's my take on this:
Use file to read the data file, producing an array where each element corresponds to a line in the input.
Use array_filter with trim as the filter function to remove blank lines from this array. This takes advantage of the fact that trim returns its argument with whitespace removed from both ends, leaving the empty string if the argument was all whitespace to begin with. The empty string converts to boolean false, so array_filter discards lines that are all whitespace.
Use array_map with a callback that involves calling explode to split each array element (line of text) into three parts and returning the second of these. This will produce an array where each element is just a name.
Use array_map again with strtoupper as the callback to convert all names to uppercase so that "jim" and "JIM" will count as the same in the next step.
Finally, use array_count_values to get the count of occurrences for each name.
Code, taking things slowly:
function extract_name($line) {
    // The -1 parameter (available as of PHP 5.1.0) makes explode return all elements
    // but the last one. We want to do this so that the element we are interested in
    // (the second) is actually the last in the returned array, enabling us to pull it
    // out with end(). This might seem strange here, but see below.
    $parts = explode('||', $line, -1);
    return end($parts);
}
$lines = file('data.txt'); // #1
$lines = array_filter($lines, 'trim'); // #2
$names = array_map('extract_name', $lines); // #3
$names = array_map('strtoupper', $names); // #4
$counts = array_count_values($names); // #5
print_r($counts); // to see the results
There is a reason I chose to do this in steps, where each step is a function call on the result of the previous step: it's actually possible to do it in just one line:
$counts = array_count_values(
array_map(function($line){return strtoupper(end(explode('||', $line, -1)));},
array_filter(file('data.txt'), 'trim')));
print_r($counts);
I should mention that this might not be the "best" way to solve the problem in the sense that if your input file is huge (in the ballpark of a few million lines) this approach will consume a lot of memory because it's reading all the input in memory at once. However, it's certainly convenient and unless you know that the input is going to be that large there's no point in making life harder.
Note: Senior-level PHP developers might have noticed that I'm violating strict standards here by feeding the result of explode to a function that accepts its argument by reference. That's valid criticism, but in my defense I am trying to keep the code as short as possible. In production it would indeed be better to use $a = explode(...); return $a[1]; although the result will be the same.
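If the input really is too large to load at once, the same count can be built while reading one line at a time. This is only a sketch under that assumption: count_names is a made-up helper name and the demo data is a shortened copy of the sample from the question.

```php
<?php
// Hypothetical helper: count names (column 2) without loading the whole file.
function count_names(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $counts = [];
    while (($line = fgets($handle)) !== false) {
        $parts = explode('||', trim($line));
        if (count($parts) < 2) {
            continue; // skip blank or malformed lines
        }
        // Upper-case so that "jim" and "Jim" are counted together.
        $name = strtoupper($parts[1]);
        $counts[$name] = ($counts[$name] ?? 0) + 1;
    }
    fclose($handle);

    return $counts;
}

// Demo on a throwaway file.
$path = tempnam(sys_get_temp_dir(), 'names');
file_put_contents($path, "26||jim||1990\n31||Tanya||1942\n\n8||Jim||1994\n");
$counts = count_names($path);
print_r($counts);
```

Only the array of counts stays in memory, so usage grows with the number of distinct names rather than with the number of lines.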

While I do feel that this website's purpose is to answer questions and not do homework assignments, I don't acknowledge the assumption that you are doing your homework, since that fact has not been provided. I personally learned how to program by example. We all learn our own ways, so here is what I would do if I were to attempt to answer your question as accurately as possible, based on the information you have provided.
<?php
$unique_name_count = 0;
$names = array();
$filename = 'Data.txt';
// Read the whole file into a string
$pointer = fopen($filename, 'r');
$contents = fread($pointer, filesize($filename));
fclose($pointer);
$lines = explode("\n", $contents);
foreach ($lines as $line)
{
    // Splitting "26||jim||1990" on a single '|' yields empty strings between
    // the doubled pipes, so the name lands at index 2
    $split_str = explode('|', $line);
    if (isset($split_str[2]))
    {
        $name = strtolower($split_str[2]);
        if (!in_array($name, $names))
        {
            $names[] = $name;
            $unique_name_count++;
        }
    }
}
// Compare the integer directly; count() expects an array, not an int
echo $unique_name_count.' unique name'.($unique_name_count == 1 ? '' : 's').' found in '.$filename."\n";
?>

Remove Duplicate ID's?

I have a list of 50,000 ID's in a flat file and need to remove any duplicate ID's. Is there any efficient/recommended algorithm for my problem?
Thanks.

You can use the command line sort program to order and filter the list of ids. This is a very efficient program and scales well too.
sort -u ids.txt > filteredIds.txt

Read into a dictionary line by line, discarding duplicates. When all read, write out to a new file.
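In PHP the "dictionary" would be an associative array. A minimal sketch of this approach follows; the function name dedupe_to_new_file, the file names, and the sample IDs are all assumptions made for the example.

```php
<?php
// Hypothetical helper: copy $in to $out, skipping IDs that were already seen.
function dedupe_to_new_file(string $in, string $out): int
{
    $src = fopen($in, 'r');
    $dst = fopen($out, 'w');
    if ($src === false || $dst === false) {
        throw new RuntimeException('Cannot open input or output file');
    }

    $seen = []; // the ID is the key, so lookups do not scan the array
    while (($line = fgets($src)) !== false) {
        $id = trim($line);
        if ($id === '' || isset($seen[$id])) {
            continue; // blank line or duplicate
        }
        $seen[$id] = true;
        fwrite($dst, $id . "\n");
    }
    fclose($src);
    fclose($dst);

    return count($seen); // number of unique IDs written
}

// Demo on throwaway files.
$in = tempnam(sys_get_temp_dir(), 'ids');
$out = tempnam(sys_get_temp_dir(), 'ids');
file_put_contents($in, "a1\nb2\na1\nc3\nb2\n");
$written = dedupe_to_new_file($in, $out);
echo $written, " unique IDs\n";
```

The original order of first occurrences is preserved in the output file.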

I did some experiments once, and the fastest solution I could get in PHP was sorting the items and manually removing all the duplicates.
If performance isn't that much of an issue for you (which I suspect, since 50,000 is not that many), then you can use array_unique(): http://php.net/array_unique
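The sort-then-remove idea could look roughly like the sketch below. The function name sorted_unique and the sample IDs are invented, and no timing is claimed here; whether it beats array_unique() would need to be measured on your own data.

```php
<?php
// Hypothetical helper: sort the IDs, then drop adjacent duplicates.
function sorted_unique(array $ids): array
{
    // After sorting, equal IDs sit next to each other.
    sort($ids, SORT_STRING);

    $result = [];
    $previous = null;
    foreach ($ids as $id) {
        if ($id !== $previous) {
            $result[] = $id;
            $previous = $id;
        }
    }
    return $result;
}

$unique = sorted_unique(['42', '7', '42', '13', '7']);
print_r($unique);
```

Unlike array_unique(), this returns the IDs in sorted order rather than in their original order.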

I guess if you have a large enough memory allowance, you can put all these IDs in an array:
$array[$id] = $id;
This would automatically weed out the dupes.

You can do:
file_put_contents($file, implode("\n", array_unique(file($file, FILE_IGNORE_NEW_LINES))));
How does it work?
Read the file using the function file, which returns an array.
Get rid of the duplicate lines using array_unique.
implode those unique lines with "\n" to get a string.
Write the string back to the file using file_put_contents.
This solution assumes that you've got one ID per line in the flat file.

You can do it via an array and array_unique. In this example I assume your IDs are separated by line breaks; if that's not the case, just change the delimiter.
$file = file_get_contents('/path/to/file.txt');
$array = explode("\n",$file);
$array = array_unique($array);
$file = implode("\n",$array);
file_put_contents('/path/to/file.txt',$file);

If you can just explode the contents of the file on a comma (or any delimiter), then array_unique will produce the least (and cleanest) code; otherwise, if you are parsing the file, going with $array[$id] = $id is the fastest and cleanest solution.

If you can use a terminal (or native unix execution), the easiest way (assuming that there is nothing else in the file) is:
sort < ids.txt | uniq > filteredIds.txt
