I have an application that needs to open a file, find a string in it, and print the number of the line where the string is found.
For example, the file example.txt contains a few hashes:
APLF2J51 1a79a4d60de6718e8e5b326e338ae533 EEQJE2YX
66b375b08fc869632935c9e6a9c7f8da O87IGF8R
c458fb5edb84c54f4dc42804622aa0c5 APLF2J51 B7TSW1ZE
1e9eea56686511e9052e6578b56ae018 EEQJE2YX
affb23b07576b88d1e9fea50719fb3b7
So, I want PHP to search for "1e9eea56686511e9052e6578b56ae018" and print out its line number, in this case 4.
Please note that a hash will not appear more than once in the file.
I found a few code snippets on the Internet, but none seem to work.
I tried this one:
<?PHP
$string = "1e9eea56686511e9052e6578b56ae018";
$data = file_get_contents("example.txt");
$data = explode("\n", $data);
for ($line = 0; $line < count($data); $line++) {
if (strpos($data[$line], $string) >= 0) {
die("String $string found at line number: $line");
}
}
?>
It just says that the string is found at line 0, which is not correct.
The final application is much more complex than that.
After it finds the line number, it should replace the string with something else, save the changes to the file, and then go on to further processing.
Thanks in advance :)
An ultra-basic solution could be:
$search = "1e9eea56686511e9052e6578b56ae018";
$lines = file('example.txt');
$line_number = false;
// each() was removed in PHP 8, so iterate with foreach instead
foreach ($lines as $key => $line) {
if (strpos($line, $search) !== FALSE) {
$line_number = $key + 1;
break;
}
}
echo $line_number;
A memory-saver version, for larger files:
$search = "1e9eea56686511e9052e6578b56ae018";
$line_number = false;
if ($handle = fopen("example.txt", "r")) {
$count = 0;
while (($line = fgets($handle, 4096)) !== FALSE and !$line_number) {
$count++;
$line_number = (strpos($line, $search) !== FALSE) ? $count : $line_number;
}
fclose($handle);
}
echo $line_number;
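As for why the code in the question always reports line 0: strpos() returns false when the needle is absent, and in a loose comparison false >= 0 evaluates to true, so the very first line passes the test. A small sketch of the difference (the sample line is taken from the example file; the variable names are only illustrative):

```php
<?php
$needle = "1e9eea56686511e9052e6578b56ae018";
$line   = "66b375b08fc869632935c9e6a9c7f8da O87IGF8R"; // does not contain $needle

$pos = strpos($line, $needle);   // false: the needle is not in this line

$looseCheck  = ($pos >= 0);      // true, because both sides are compared as booleans
$strictCheck = ($pos !== false); // false, which is the answer we actually want

var_dump($pos, $looseCheck, $strictCheck);
```

This is why both versions above compare the result of strpos() with !== FALSE.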
function get_line_from_hashes($file, $find){
$file_content = file_get_contents($file);
$lines = explode("\n", $file_content);
foreach($lines as $num => $line){
$pos = strpos($line, $find);
if($pos !== false)
return $num + 1;
}
return false;
}
get_line_from_hashes("arquivo.txt", "asdsadas2e3xe3ceQ#E"); //return some number or false case not found.
If you need a fast and universal solution that also works for finding the line number of multi-line text in a file, use this:
$file_content = file_get_contents('example.txt');
$content_before_string = strstr($file_content, $string, true);
if (false !== $content_before_string) {
$line = count(explode(PHP_EOL, $content_before_string));
die("String $string found at line number: $line");
}
FYI: this works only with PHP 5.3.0+, because it relies on the third parameter of strstr().
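A variation on the same idea, as a rough sketch: counting the "\n" characters before the match offset avoids building an array and does not depend on PHP_EOL matching the file's line endings. The helper name lineNumberOfString is hypothetical, and the sample content stands in for the result of file_get_contents().

```php
<?php
// Hypothetical helper: returns the 1-based line on which $needle starts, or false.
function lineNumberOfString($content, $needle)
{
    $pos = strpos($content, $needle);
    if ($pos === false) {
        return false;
    }
    // Every "\n" before the match means one more completed line.
    return substr_count(substr($content, 0, $pos), "\n") + 1;
}

// Sample content; in practice this would come from file_get_contents().
$content = "first line\nsecond line\nthird line\n";

// The needle spans two lines and starts on the second one.
echo lineNumberOfString($content, "line\nthird");
```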
$content = file_get_contents('example.txt');
$pattern = '/1e9eea56686511e9052e6578b56ae018/';
if (preg_match($pattern, $content, $matches, PREG_OFFSET_CAPTURE)) {
//PREG_OFFSET_CAPTURE will add offset of the found string to the array of matches
//now get a substring of the offset length and explode it by \n
$lineNumber = count(explode("\n", substr($content, 0, $matches[0][1])));
}
If the file is not extremely large, just read it into an array with file(), search for the word with preg_grep(), get the index key for that line, and add 1 since the array starts at 0:
$string = "1e9eea56686511e9052e6578b56ae018";
echo key(preg_grep("/$string/", file("example.txt"))) + 1;
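The one-liner above prints 1 when nothing matches (key() returns null for an empty result) and treats regex metacharacters in the search string as pattern syntax. A slightly more defensive sketch of the same approach, where firstMatchingLine is a hypothetical helper and the sample array stands in for the result of file():

```php
<?php
// Hypothetical helper: 1-based number of the first line containing $search, or false.
function firstMatchingLine(array $lines, $search)
{
    // preg_quote() escapes regex metacharacters; '/' is the delimiter used below.
    $matches = preg_grep('/' . preg_quote($search, '/') . '/', $lines);
    if (empty($matches)) {
        return false;
    }
    // preg_grep() preserves the original keys, so the first key is the 0-based index.
    return key($matches) + 1;
}

// Sample lines; in practice this would be file("example.txt").
$lines = array("aaa 1.5\n", "bbb 1x5\n", "ccc 1.5\n");

var_dump(firstMatchingLine($lines, "1.5"));
var_dump(firstMatchingLine($lines, "zzz"));
```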
I found this to work great and be very efficient. Simply explode the file contents into lines and search through the array for your search term, like so:
function getLineNum($haystack, $needle){
# Our Count
$c = 1;
# Turn our file contents/haystack into an array
$hsarr = explode("\n", $haystack);
# Iterate through each value in the array as $str
foreach($hsarr as $str){
# If the current line contains our needle/hash we are looking for it
# returns the current count.
if(strstr($str, $needle)) return $c;
# If not, Keep adding one for every new line.
$c++;
}
# If nothing is found
if($c >= count($hsarr)) return 'No hash found!';
}
EDIT: Looking through the other answers, I realize that Guilherme Soares had a similar approach but used strpos, which in this case doesn't work. So I made a few alterations with his idea in mind here:
function getLineNum($haystack, $needle){
$hsarr = explode(PHP_EOL, $haystack);
foreach($hsarr as $num => $str) if(strstr($str, $needle)) return $num + 1;
return 'No hash found!';
}
Live Demo: https://ideone.com/J4ftV3
I have a problem where I need to search an HTML page/snippet and replace any value that sits between two pairs of percent signs with the constant of the same name, e.g. %%THIS_CONSTANT%% becomes THIS_CONSTANT.
Right now I am searching through the page, line by line, and I am able to find matches and replace them by using preg_match_all and preg_replace.
$file_scan = fopen($directory.$file, "r");
if ($file_scan) {
while (($line = fgets($file_scan)) !== false) {
if(preg_match_all('/\%%(.*?)\%%/', $line, $matches)){
foreach($matches as $match){
foreach($match as $m){
$repair = preg_replace('/\%%(.*?)\%%/', $m, $m);
if(preg_match('/\%%(.*?)\%%/', $m, $m)){
} else {
echo $repair.' '.$j;
$j++;
}
}
$lines[$i] = preg_replace('/\%%(.*?)\%%/', constant($repair), $line);
}
} else {
$lines[$i] = $line;
}
$i++;
}
$template[$name] = implode("", $lines);
fclose($file_scan);
}
What this code is not able to do is find and replace multiple matches on a single line. For instance, if there is a line with:
<img src="%%LOGO_IMAGE%%"><h1>%%TITLE%%</h1>
The above code would replace both items with the same value (TITLE). It would also give the error couldn't find constant on the first loop, but work correctly on the second.
This happens very rarely, but I just wish to know how to modify multiple instances on a single line just to be safe.
Edit:
I am able to replace the majority of the code with this:
$file_scan = fopen($directory.$file, "r");
if ($file_scan) {
while (($line = fgets($file_scan)) !== false) {
$line = preg_replace('/\%%(.*?)\%%/', '$2'.'$1', $line);
echo $line;
}
fclose($file_scan);
}
My last issue is changing the replaced items to constants. Is that possible?
Final Edit:
With help from Peter Bowers' suggestion, I used preg_replace_callback to add the ability to turn the keyword into a constant:
foreach($filenames as $file){
$name = str_replace('.html', '', $file);
$template[$name] = preg_replace_callback('/\%%(.*?)\%%/', function($matches){
$matches[0] = preg_replace('/\%%(.*?)\%%/', '$1', $matches[0]);
return constant($matches[0]);
}, file_get_contents($directory.$file));
}
return $template;
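Since the pattern already captures the name in its first group, the callback can read $matches[1] directly instead of running a second preg_replace. A minimal sketch, assuming that placeholders without a matching constant should be left untouched (fillConstants is a hypothetical name, and the two constants are defined only for the demonstration):

```php
<?php
// Hypothetical helper: replaces %%NAME%% with the value of the constant NAME.
function fillConstants($html)
{
    return preg_replace_callback('/%%(.*?)%%/', function ($matches) {
        // $matches[1] is the text captured between the percent signs.
        $name = $matches[1];
        // Assumption: unknown placeholders are left exactly as they were.
        return defined($name) ? (string) constant($name) : $matches[0];
    }, $html);
}

// Demo constants, made up for this example.
define('TITLE', 'My Site');
define('LOGO_IMAGE', 'logo.png');

echo fillConstants('<img src="%%LOGO_IMAGE%%"><h1>%%TITLE%%</h1>');
```

Because the callback runs once per match, several placeholders on the same line are each replaced with their own constant.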
Here's a much simpler implementation.
$file_scan = fopen($directory.$file, "r");
if ($file_scan) {
$out = '';
while (($line = fgets($file_scan)) !== false) {
$out .= preg_replace('/\%%(.*?)\%%/', '$1', $line);
$i++;
}
$template[$name] = $out;
fclose($file_scan);
}
Or, even simpler:
$str = file_get_contents($directory.$file);
$template[$name] = preg_replace('/\%%(.*?)\%%/', '$1', $str);
And, since we're going totally simple here...
$template[$name] = preg_replace('/\%%(.*?)\%%/', '$1', file_get_contents($directory.$file));
(Obviously you are losing some of your error checking capabilities as we approach the one-liner, but - hey - I was having fun... :-)
Try with this:
<?php
define('TITLE', 'Title');
define('LOGO_IMAGE', 'Image');
$lines = array();
$file_scan = fopen($directory.$file, "r");
if ($file_scan) {
while (($line = fgets($file_scan)) !== false) {
if(preg_match_all('/\%%(.*?)\%%/', $line, $matches)){
for($i = 0; $i < count($matches[0]); $i++) {
$line = str_replace($matches[0][$i], constant($matches[1][$i]), $line);
}
}
// keep every line, whether or not it contained a placeholder
$lines[] = $line;
}
fclose($file_scan);
}
$template[$name] = implode("", $lines);
?>
I'm new to PHP, so I need help building this script.
I have a file named file.txt with the following lines:
aaaa 1234
bbba 1234
aaaa 1236
cccc 1234
aaaa 1238
dddd 1234
I want to find the line with string "aaaa" and print:
String "aaaa" found 3 times at lines: 1, 3, 5.
Even better if it can also print those lines.
I tried this code:
<?
function find_line_number_by_string($filename, $search, $case_sensitive=false ) {
$line_number = '';
if ($file_handler = fopen($filename, "r")) {
$i = 0;
while ($line = fgets($file_handler)) {
$i++;
//case sensitive is false by default
if($case_sensitive == false) {
$search = strtolower($search); //convert file and search string
$line = strtolower($line); //to lowercase
}
//find the string and store it in an array
if(strpos($line, $search) !== false){
$line_number .= $i.",";
}
}
fclose($file_handler);
}else{
return "File not exists, Please check the file path or filename";
}
//if no match found
if(count($line_number)){
return substr($line_number, 0, -1);
}else{
return "No match found";
}
}
$output = find_line_number_by_string('file.txt', 'aaaa');
print "String(s) found in ".$output;
?>
But I don't know how to count the total number of matches (3) and print each matching line.
Thanks in advance.
There are lots of ways to do this that produce the same final result but differ in the specifics.
Assuming that your input is not large enough that you are concerned about loading it in memory all at once, one of the most convenient approaches is to use file to read the file's contents into an array of lines, then preg_grep to filter the array and only keep the matching lines. The resulting array's keys will be line numbers and the values will be whole lines that matched, perfectly fitting your requirements.
Example:
$lines = file('file.txt');
$matches = preg_grep('/aaaa/', $lines);
echo count($matches)." matches found.\n";
foreach ($matches as $line => $contents) {
echo "Line ".($line + 1).": ".$contents."\n";
}
$str = "aaaa";
$handle = fopen("your_file.txt", "r");
if ($handle) {
echo "String '".$str."' found at lines : ";
$count = 0;
$arr_lines = array();
while (($line = fgets($handle)) !== false) {
$count+=1;
if (strpos($line, $str) !== false) {
$arr_lines[] = $count;
}
}
echo implode(", ", $arr_lines).".";
}
UPDATE 2 :
$file = "your_file.txt";
$str = "aaaa";
$arr = count_line_no($file, $str);
if(count($arr)>0)
{
echo "String '".$str."' found at lines : ".implode(", ", $arr).".";
}
else
{
echo "String '".$str."' not found in file ";
}
function count_line_no($file, $str)
{
$arr_lines = array();
$handle = fopen($file, "r");
if ($handle) {
$count = 0;
$arr_lines = array();
while (($line = fgets($handle)) !== false) {
$count+=1;
if (strpos($line, $str) !== false) {
$arr_lines[] = $count;
}
}
}
return $arr_lines;
}
Try this to solve your problem:
if(file_exists("file.txt")) // check file is exists
{
$f = fopen("file.txt", "r");
// Read line by line until end of file
$row_count = 0;
while(!feof($f))
{
$row_count += 1;
$row_data = fgets($f);
$findme = 'aaaa';
$pos = strpos($row_data, $findme);
if ($pos !== false)
{
echo "The string '$findme' was found in the string '$row_data'";
echo "<br> and line number is ".$row_count;
}
else
{
echo "The string '$findme' was not found ";
}
}
fclose($f);
}
I'm stuck. Ignore the top line; I have no use for the date yet, but will be using it in a bit. I'm having issues primarily with this line:
if(strpos($line, $extension) !== false and (preg_match('#\d#',$line !== false))){
What I'm trying to do is: if a domain name ($line) is a .com and has no numbers, then echo it. All of the preg_replace and strlen calls seem to be working, but I can't get the check to behave the way I need. Do I need to put the preg_match outside of the <=40 rule, as it may be causing confusion?
<?php
date_default_timezone_set('UTC');
$extension = '.com';
$lines = file('PoolDeletingDomainsList.txt');
echo "<b>4 Letter premiums for ". date("n/j/Y") .":</b><br />";
foreach($lines as $line)
if(strlen($line)<=40) {
{
// Check if the line contains the string we're looking for, and print if it does
if(strpos($line, $extension) !== false and (preg_match('#\d#',$line !== false))){
$line = preg_replace('/12:00:00 AM,AUC\b/','<br />', $line);
$line = preg_replace('/,9\/28\/2013/', '', $line);
echo $line;
}
}
}
?>
Return Values
preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.
Manual preg_match
if(strpos($line, $extension) !== false and (preg_match('#\d#',$line !== false))){
$line = preg_replace('/12:00:00 AM,AUC\b/','<br />', $line);
$line = preg_replace('/,9\/28\/2013/', '', $line);
echo $line;
}
replace with
if ((false !== strpos($line, $extension)) && (1 === preg_match('#\d#',$line))){
$line = preg_replace('/12:00:00 AM,AUC\b/','<br />', $line);
$line = preg_replace('/,9\/28\/2013/', '', $line);
echo $line;
}
This will check if $line contains .com and has numbers (otherwise those preg_replace would have nothing to work with).
Here is what seemed to work for me.
<?php
date_default_timezone_set('UTC');
$extension = '.com';
$lines = file('PoolDeletingDomainsList.txt');
echo "<b>4 Letter premiums for ". date("n/j/Y") .":</b><br />";
foreach($lines as $line)
if(strlen($line)<=36) {
{
// Check if the line contains the string we're looking for, and print if it does
$line = preg_replace('/12:00:00 AM,AUC\b/','<br />', $line);
$line = preg_replace('/,9\/28\/2013/', '', $line);
if ((false !== strpos($line, $extension)) && (0 === preg_match('#\d#',$line)) && (0 === preg_match('/-/', $line))){
echo $line;
}
}
}
?>
According to the preg_match documentation:
preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.
So, with your current condition, the if statement evaluates to TRUE whenever preg_match returns any value that is not FALSE (which includes both 1 and 0). Since preg_match returns 1 when a match is found and 0 when it is not, all your domains pass the condition and are echoed.
To fix the error, change your if statement to:
if(strpos($line, $extension) !== false && !preg_match('#\d#',$line)) {
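To see the return values in isolation, here is a small sketch; the two domain names are made up for illustration:

```php
<?php
$withDigits    = preg_match('#\d#', '37signals.com'); // 1: a digit was found
$withoutDigits = preg_match('#\d#', 'example.com');   // 0: no digit, but not an error either

// Comparing against false only detects errors, so both names pass this test:
var_dump($withDigits !== false, $withoutDigits !== false);

// Comparing against 0 (or using !preg_match(...)) singles out the digit-free name:
var_dump($withDigits === 0, $withoutDigits === 0);
```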
So you're searching for .com domains that have no numbers in them ("premium domains").
<?php
$lines = array(
'example.com',
'exa13mple.com',
'domain.org',
'google.com',
'37signals.com'
);
foreach ($lines as $line)
{
$matches = array();
$isComDomain = preg_match('/\w+\\.com/', $line, $matches);
$hasNoNumbers = !empty($matches) ? preg_match('/^[a-zA-Z]+\\.com$/', $matches[0]) : false;
if ($isComDomain && $hasNoNumbers) {
print $matches[0] . "\n";
}
}
The isComDomain variable is a boolean telling whether a [word characters].com was found in the line. If it was, the matched domain name is stored in $matches[0].
Then the hasNoNumbers is a boolean telling if the .com domain name contained only chars from a-z and A-Z. You may wish to include "-" in the regex if you allow dashes.
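If dashes are allowed, the character class might be extended as in the sketch below. The assumption here is that a valid name consists only of letters and dashes followed by ".com"; the candidate list is made up for illustration.

```php
<?php
// Assumption: a "premium" name is letters and dashes only, followed by ".com".
// A dash placed last in a character class is taken literally.
$pattern = '/^[a-zA-Z-]+\.com$/';

$candidates = array('example.com', 'my-site.com', 'exa13mple.com', 'domain.org');
$premium = array();
foreach ($candidates as $candidate) {
    if (preg_match($pattern, $candidate) === 1) {
        $premium[] = $candidate;
    }
}
print_r($premium);
```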
How to count specific lines in a text file depending on a particular variable in that line.
For example, I need to count only the lines of a text file that contain, for instance, $item1 or $item2.
Sounds like you need something like what grep -c does in the shell. Try something like this:
$item1 = 'match me';
$item2 = 'match me too';
// Thanks to @Baba for the suggestion:
$match_count = count(
preg_grep(
'/'.preg_quote($item1, '/').'|'.preg_quote($item2, '/').'/i',
file('somefile_input.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
)
);
// does the same without creating a second array with the matches
$match_count = array_reduce(
file('somefile_input.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES),
function($match_count, $line) use ($item1, $item2) {
return
preg_match('/'.preg_quote($item1, '/').'|'.preg_quote($item2, '/').'/i', $line) ?
$match_count + 1 : $match_count;
},
0 // start from zero so the result is an int even when nothing matches
);
The above code sample uses the file() function to read the file into an array (split into lines), array_reduce() to iterate over that array, and preg_match() inside the iteration to see whether a line matched (the /i at the end makes it case-insensitive).
You could use a foreach loop as well.
This code reads file.php and counts only lines containing '$item1' or '$item2'. The check itself could be finetuned, since you have to add a new stristr() for every word you want to check.
<?php
$file = 'file.php';
$fp = fopen($file, 'r');
$size = filesize($file);
$content = fread($fp, $size);
$lines = preg_split('/\n/', $content);
$count = 0;
foreach($lines as $line) {
if(stristr($line, '$item1') || stristr($line, '$item2')) {
$count++;
}
}
echo $count;
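To avoid adding a new stristr() call for every word, the words could be passed as an array. A minimal sketch, in which countLinesContaining is a hypothetical helper and the sample lines stand in for the result of file('file.php'):

```php
<?php
// Hypothetical helper: counts lines containing at least one of $needles (case-insensitive).
function countLinesContaining(array $lines, array $needles)
{
    $count = 0;
    foreach ($lines as $line) {
        foreach ($needles as $needle) {
            if (stripos($line, $needle) !== false) {
                $count++;
                break; // count each line once, even if several needles match
            }
        }
    }
    return $count;
}

// Sample lines; single quotes keep '$item1' as literal text.
$lines = array('echo $item1;', 'echo $other;', '$ITEM2 = $item1;', '');

echo countLinesContaining($lines, array('$item1', '$item2'));
```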
Read your file line by line and use strpos to determine if a line contains a specific string/item.
$handle = fopen ("filename", "r");
$counter = 0;
while (!feof($handle))
{
$line = fgets($handle);
// or $item2, $item3, etc.
$pos = strpos($line, $item);
if ($pos !== false)
{
$counter++;
}
}
fclose ($handle);
I am curious: if you have a string, how would you detect the delimiter?
We know PHP can split a string up with explode(), which requires a delimiter parameter.
But what about a method to detect the delimiter before sending it to explode function?
Right now I am just outputting the string to the user and they enter the delimiter. That's fine -- but I am looking for the application to pattern recognize for me.
Should I look to regular expressions for this type of pattern recognition in a string?
EDIT: I have failed to initially specify that there is a likely expected set of delimiters. Any delimiter that is probably used in a CSV. So technically anyone could use any character to delimit a CSV file but it is more probable to use one of the following characters: comma, semicolon, vertical bar and a space.
EDIT 2: Here is the workable solution I came up with for a "determined delimiter".
$get_images = "86236058.jpg 86236134.jpg 86236134.jpg";
//Detection of delimiter of image filenames.
$probable_delimiters = array(",", " ", "|", ";");
$delimiter_count_array = array();
foreach ($probable_delimiters as $probable_delimiter) {
$probable_delimiter_count = substr_count($get_images, $probable_delimiter);
$delimiter_count_array[$probable_delimiter] = $probable_delimiter_count;
}
$max_value = max($delimiter_count_array);
$determined_delimiter_array = array_keys($delimiter_count_array, max($delimiter_count_array));
// each() was removed in PHP 8; foreach visits the same key/value pairs
foreach ($determined_delimiter_array as $key => $value) {
$determined_delimiter_count = $key;
$determined_delimiter = $value;
}
$images = explode("{$determined_delimiter}", $get_images);
Determine which delimiters you consider probable (such as the comma, semicolon and vertical bar) and, for each, count how often it occurs in the string with substr_count. Then choose the one with the most occurrences as the delimiter and explode.
Even though that might not be fail-safe it should work in most cases ;)
I would say this works in 99.99% of cases :)
The basic idea is that the number of valid delimiters should be the same on every line.
This script calculates the delimiter count discrepancies between all lines.
Less discrepancy means a more likely valid delimiter.
Putting it all together, this function reads the rows and returns them as an array:
function readCSV($fileName)
{
//detect these delimeters
$delA = array(";", ",", "|", "\t");
$linesA = array();
$resultA = array();
$maxLines = 20; //maximum lines to parse for detection, this can be higher for more precision
$lines = count(file($fileName));
if ($lines < $maxLines) {//if lines are less than the given maximum
$maxLines = $lines;
}
//load lines
foreach ($delA as $key => $del) {
$rowNum = 0;
if (($handle = fopen($fileName, "r")) !== false) {
$linesA[$key] = array();
while ((($data = fgetcsv($handle, 1000, $del)) !== false) && ($rowNum < $maxLines)) {
$linesA[$key][] = count($data);
$rowNum++;
}
fclose($handle);
}
}
//count rows delimiter number discrepancy from each other
foreach ($delA as $key => $del) {
echo 'try for key=' . $key . ' delimeter=' . $del;
$discr = 0;
foreach ($linesA[$key] as $actNum) {
if ($actNum == 1) {
$resultA[$key] = 65535; //there is only one column with this delimeter in this line, so this is not our delimiter, set this discrepancy to high
break;
}
foreach ($linesA[$key] as $actNum2) {
$discr += abs($actNum - $actNum2);
}
//if its the real delimeter this result should the nearest to 0
//because in the ideal (errorless) case all lines have same column number
$resultA[$key] = $discr;
}
}
var_dump($resultA);
//select the discrepancy nearest to 0, this would be our delimiter
$delRes = 65535;
foreach ($resultA as $key => $res) {
if ($res < $delRes) {
$delRes = $res;
$delKey = $key;
}
}
$delimeter = $delA[$delKey];
echo '$delimeter=' . $delimeter;
//get rows
$row = 0;
$rowsA = array();
if (($handle = fopen($fileName, "r")) !== false) {
while (($data = fgetcsv($handle, 1000, $delimeter)) !== false) {
$rowsA[$row] = Array();
$num = count($data);
for ($c = 0; $c < $num; $c++) {
$rowsA[$row][] = trim($data[$c]);
}
$row++;
}
fclose($handle);
}
return $rowsA;
}
I have the same problem. I am dealing with a lot of CSVs from various databases, which various people extract to CSV in various ways, sometimes differently each time for the same dataset. I have simply implemented a function like this in my convert base class:
protected function detectDelimiter() {
$handle = @fopen($this->CSVFile, "r");
if ($handle) {
$line=fgets($handle, 4096);
fclose($handle);
$test=explode(',', $line);
if (count($test)>1) return ',';
$test=explode(';', $line);
if (count($test)>1) return ';';
//.. and so on
}
//return default delimiter
return $this->delimiter;
}
I made something like this:
$line = fgetcsv($handle, 1000, "|");
if (isset($line[1]))
{
echo "delimiter is: |";
$delimiter="|";
}
else
{
rewind($handle); // test the same first line again with the next candidate
$line1 = fgetcsv($handle, 1000, ";");
if (isset($line1[1]))
{
echo "delimiter is: ;";
$delimiter=";";
}
else
{
echo "delimiter is: ,";
$delimiter=",";
}
}
This simply checks whether there is a second column after a line is read.
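The same check can be run against a single line that has already been read, trying each candidate in turn. A rough sketch, assuming the candidate that yields the most columns is the right one (guessDelimiter is a hypothetical helper, and the sample line is made up):

```php
<?php
// Hypothetical helper: picks the candidate that splits $line into the most columns.
function guessDelimiter($line, array $candidates, $default = ',')
{
    $best = $default;
    $bestColumns = 1;
    foreach ($candidates as $candidate) {
        // str_getcsv() honours quoted fields, unlike a plain explode().
        $columns = count(str_getcsv($line, $candidate, '"', '\\'));
        if ($columns > $bestColumns) {
            $best = $candidate;
            $bestColumns = $columns;
        }
    }
    return $best;
}

// The quoted field contains a comma, but the semicolon still yields more columns.
echo guessDelimiter('id;"Doe, John";2013-09-28', array('|', ';', ','));
```

When no candidate produces more than one column, the sketch falls back to the default delimiter.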
I am having the same issue. My system will receive CSV files from the client, but they could use ";", "," or " " as the delimiter, and I want to improve the system so the client doesn't have to know which one it is (they never do).
I searched and found this library:
https://github.com/parsecsv/parsecsv-for-php
Very good and easy to use.