PHP - get line number of regex result


I'm trying to write some PHP that will read a CSS file, find all occurrences of the #group comment, and their line number. This is what I have so far, but it's returning the character count rather than the line number.
$file = 'master.css';
$string = file_get_contents($file);
$matches = array();
preg_match_all('/\/\* #group.*?\*\//m', $string, $matches, PREG_OFFSET_CAPTURE);
list($capture, $offset) = $matches[0];
$line_number = substr_count(substr($string, 0, $offset), "\n") + 1;
echo '<pre>';
print_r($matches[0]);
echo '</pre>';

Try using file() rather than file_get_contents(). The difference is that file() returns the file contents as an array, one element per line, rather than as a single string. Note that file() keeps the newline character at the end of each element. If you don't want that, pass the FILE_IGNORE_NEW_LINES flag as the second parameter.
From there, you can use preg_grep() to return only the elements of that array that match your pattern. preg_grep() preserves the original keys, and the array is zero-indexed, so a match with key N is on line N + 1:
An example:
myfile.txt:
hello world
how are you
say hello back!
line_find.php:
$filename = "myfile.txt";
$fileContents = file($filename);
$pattern = "/hello/";
$linesFound = preg_grep($pattern, $fileContents);
echo "<pre>", print_r($linesFound, true), "</pre>";
Result:
Array
(
[0] => hello world
[2] => say hello back!
)
Hope that helps.

This is not going to be optimal, but if you don't care about that:
$line_number = 1 + substr_count($string, "\n", 0, $index);
It just counts the newline characters that occur before $index, the offset you get from the offset capture.
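
To apply that idea to every match rather than a single offset, a minimal sketch could look like the following. The helper name match_line_numbers() and the sample CSS are made up for illustration; the approach is the same substr_count() on the offsets returned by PREG_OFFSET_CAPTURE (passing a length of 0 to substr_count() assumes PHP 7.1 or later).

```php
<?php
// Hypothetical helper: returns [line number, matched text] pairs for every match.
function match_line_numbers(string $pattern, string $subject): array
{
    $results = [];
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    // With PREG_OFFSET_CAPTURE each entry of $matches[0] is [text, byte offset].
    foreach ($matches[0] as [$text, $offset]) {
        // Count the newlines before the match; the first line is line 1.
        $results[] = [substr_count($subject, "\n", 0, $offset) + 1, $text];
    }
    return $results;
}

// Sample input, assumed for the example only.
$css = "body {}\n/* #group Header */\nh1 {}\n\n/* #group Footer */\n";
print_r(match_line_numbers('/\/\* #group.*?\*\//', $css));
```

Each pair holds the line on which the match starts, so a comment that spans several lines is reported at its first line.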

Related

Printing all occurrences of a string in a file using regular expressions using php

I am writing code that will print all instances of a string from a file of about 500 pages.
This is some of the code:
$file = "serialnumbers.txt";
$file_open = fopen($file, 'r');
$string = "\$txtserial";
$read = fread($file_open,'8000000');
$match_string = preg_match('/^$txtserial/', $read, $matches[]=null);
for($i = 0; sizeof($matches) > $i; $i++)
{
echo "<li>$matches[$i]</li>";
}
All of the serial numbers start with "$txtserial" followed by about 10 numerical characters, some of them separated by a comma (,). Example: $txtserial0840847276,8732569089.
I am looking for a way to print every instance of $txtserial together with the numerical characters that follow it, excluding the comma (,). I have used regular expressions, but if there is any other method I could use I would be grateful. I just want to get this done as quickly as possible.

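The problem here is that you are trying to build the regex from a string variable inside single quotes, where $txtserial is not interpolated (and the $ is read as a regex anchor):
$match_string = preg_match('/^$txtserial/', $read, $matches[]=null);
You can use the following instead; the second argument to preg_quote() also escapes the / delimiter:
$match_string = preg_match('/^' . preg_quote($txtserial, '/') . '/', $read, $matches);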

Try this example using preg_match_all() function:
$txtserial = 'MSHKK';
$read = 'MSHKK1231231231,23
MSHKK1231231
txtserial123123109112
MSHKK1231231111,123123123';
$match_string = preg_match_all('/(?:^|(?<=[\n\r]))('.preg_quote($txtserial).'\d{10})\b/', $read, $matches);
print_r($matches[1]);
Output:
[0] => MSHKK1231231231
[1] => MSHKK1231231111
It picks out the portions at the start of a line that begin with the value held in $txtserial and are followed by exactly 10 digits.
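
If the comma in the question separates additional serial numbers on the same line (an assumption, since the question does not say so explicitly), a sketch along these lines would collect every number without the commas. The function name extract_serials() is hypothetical.

```php
<?php
// Hypothetical helper: collects every run of digits that follows $prefix at the
// start of a line, treating commas as separators between numbers (assumption).
function extract_serials(string $text, string $prefix): array
{
    $serials = [];
    // The m modifier lets ^ match at the start of every line.
    $pattern = '/^' . preg_quote($prefix, '/') . '([\d,]+)/m';
    preg_match_all($pattern, $text, $matches);
    foreach ($matches[1] as $digits) {
        foreach (explode(',', $digits) as $number) {
            if ($number !== '') {
                $serials[] = $number;
            }
        }
    }
    return $serials;
}

// Sample input, assumed for the example only.
$read = "\$txtserial0840847276,8732569089\nother line\n\$txtserial1234567890\n";
print_r(extract_serials($read, '$txtserial'));
```

The literal prefix is passed through preg_quote() so that the leading $ is not read as a regex anchor.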

Separate each word into an Array

I have a file with contents like :
Apple 100
banana 200
Cat 300
I want to search for a particular string in the file and get the next word. Eg: I search for cat, I get 300. I have looked up this solution: How to Find Next String After the Needle Using Strpos(), but that didn't help and I didn't get the expected output. I would be glad if you can suggest any method without using regex.

I'm not sure this is the best approach, but with the data you've provided, it'll work.
Get the contents of the file with fopen() and fread()
Separate the values into array elements with explode()
Iterate over your array in steps of two, using each even-indexed element as a key and the element after it as the value. Copy to new array.
Not perfect, but on the right track.
<?php
$filename = 'data.txt'; // Let's assume this is the file you mentioned
$handle = fopen($filename, 'r');
$contents = fread($handle, filesize($filename));
fclose($handle);
// Collapse all whitespace (including line breaks) into single spaces
$clean = trim(preg_replace('/\s+/', ' ', $contents));
$flat_elems = explode(' ', $clean);
$multi = array();
$ii = count($flat_elems);
// Even indexes hold the words, odd indexes hold the values that follow them
for ($i = 0; $i + 1 < $ii; $i += 2) {
    $multi[$flat_elems[$i]] = $flat_elems[$i + 1];
}
print_r($multi);
This outputs an associative array like this:
Array
(
[Apple] => 100
[banana] => 200
[Cat] => 300
)

Try this. It doesn't use regex, but it will be inefficient if the string you're searching is long:
function get_next_word($string, $preceding_word)
{
    // Treat line breaks as spaces, then split the string into an array of words
    $words_as_array = explode(' ', str_replace(array("\r\n", "\r", "\n"), ' ', $string));
    // Search the array of words for the word before the word we want to return
    $position = array_search($preceding_word, $words_as_array);
    if ($position !== false && isset($words_as_array[$position + 1]))
        return $words_as_array[$position + 1]; // Returns the next word
    else
        return false; // Could not find the word, or nothing follows it
}
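
Since the question searches for "cat" while the file contains "Cat", a case-insensitive variant may be closer to what is wanted. The following is a sketch only, still without regex, using strtok() and strcasecmp(); the function name next_word_after() is made up for the example.

```php
<?php
// Hypothetical helper: case-insensitive lookup of the word following $needle.
// Returns null when $needle is not found or nothing follows it.
function next_word_after(string $contents, string $needle): ?string
{
    // strtok() splits on any of the listed characters and skips empty tokens.
    $word = strtok($contents, " \t\r\n");
    while ($word !== false) {
        $next = strtok(" \t\r\n");
        if (strcasecmp($word, $needle) === 0) {
            return $next === false ? null : $next;
        }
        $word = $next;
    }
    return null;
}

// Sample data taken from the question.
$contents = "Apple 100\nbanana 200\nCat 300";
var_dump(next_word_after($contents, 'cat'));
```

Because the tokens are read one at a time, the whole word list never has to be built as an array.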

$find = 'Apple';
preg_match_all('/' . $find . '\s(\d+)/', $content, $matches);
print_r($matches);

You might benefit from using named regex subpatterns to capture the information you're looking for.
For example, to find the number (1 to 4 digits) that follows a given word:
/*String to search*/
$str = "cat 300";
/*String to find*/
$find = "cat";
/*Search for value*/
preg_match('/^' . preg_quote($find, '/') . '\s+(?P<value>[0-9]{1,4})$/', $str, $r);
/*Print results*/
print_r($r);
When a match is found, the results array contains the number you're looking for under the key 'value'.
This approach can be combined with file_get_contents($file); in that case add the m modifier so that ^ and $ match at the start and end of each line.

In PHP, if I find a word in a file, can I make the line that the word came from into a $string

I want to find a word in a large list file.
Then, when that word is found, I want to take the whole line of the list file that the word was found in.
So far I have not seen any PHP string functions that do this.

Use a line-anchored regular expression to find the word, then each match will contain the whole line.
Something like:
preg_match_all('/^.*WORD.*$/m', $filecontents, $matches);
Then $matches[0] will hold the full lines in which WORD was found. The m modifier makes ^ and $ match at the start and end of every line.

You could use preg_match_all() with the m modifier, so that ^ and $ match at line boundaries:
$arr = array();
preg_match_all('/^.*yourSearch.*$/m', $fileContents, $arr);
$arr[0] will then contain the matching lines.

$path = "/path/to/wordlist.txt";
$word = "Word";
$lines = array();
$handle = fopen($path, 'r');
$currentline = 1; // in case you want to know which line you got it from
while (($line = fgets($handle)) !== false)
{
    // strpos() returns 0 for a match at the start of the line, so compare against false
    if (strpos($line, $word) !== false)
    {
        $lines[$currentline] = $line;
    }
    $currentline++;
}
fclose($handle);
If you only want a single line where the word occurs, then instead of saving it to an array, save it somewhere and just break after the match is made.
This should work on files of any size, since only one line is held in memory at a time (file() loads the whole file at once, which is probably not a good idea for large files).
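
The single-line variant described above could be sketched as follows. The function name find_first_line() and its return shape are assumptions made for the example.

```php
<?php
// Hypothetical helper: returns [line number, line] for the first line that
// contains $word, or null if the file cannot be opened or has no match.
function find_first_line(string $path, string $word): ?array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $found = null;
    $lineNumber = 0;
    while (($line = fgets($handle)) !== false) {
        $lineNumber++;
        if (strpos($line, $word) !== false) {
            // Strip the trailing line break and stop reading.
            $found = [$lineNumber, rtrim($line, "\r\n")];
            break;
        }
    }
    fclose($handle);
    return $found;
}
```

Calling find_first_line('/path/to/wordlist.txt', 'Word') would stop reading as soon as the first match is found, so the rest of the file is never touched.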

Try this one:
$searchString = "search";
$result = preg_grep('/^.*' . preg_quote($searchString, '/') . '.*$/', file('/path/to/your/file.txt'));
print_r($result);
Explanation:
file() reads your file and produces an array of lines
preg_grep() returns the array elements that match the pattern
$result is the resulting array.

regex split string at first line break

I would like to split a string at the first line break, instead of the first blank line
'/^(.*?)\r?\n\r?\n(.*)/s' (first blank line)
So for instance, if I have:
$str = '2099 test\nAre you sure you
want to continue\n some other string
here...';
match[1] = '2099 test'
match[2] = 'Are you sure you want to continue\n some other string here...'

preg_split() has a limit parameter you can use to your advantage. You could simply do:
$lines = preg_split('/\r\n|\r|\n/', $str, 2);

<?php
$str = "2099 test\nAre you sure you want to continue\n some other string here...";
$match = explode("\n",$str, 2);
print_r($match);
?>
returns
Array
(
[0] => 2099 test
[1] => Are you sure you want to continue
some other string here...
)
explode's last parameter is the maximum number of elements to split the string into; the last element holds the rest of the string.

Just drop the second \r?\n from your pattern:
'/^(.*?)\r?\n(.*)/s'

You can use preg_split as:
$arr = preg_split("/\r?\n/",$str,2);

First line break:
$match = preg_split('/\R/', $str, 2);
First blank line:
$match = preg_split('/\R\R/', $str, 2);
Handles all the various ways of doing line breaks.
Also there was a question about splitting on the 2nd line break. Here is my implementation (maybe not most efficient... also note it replaces some line breaks with PHP_EOL)
function split_at_nth_line_break($str, $n = 1) {
$match = preg_split('/\R/', $str, $n+1);
if (count($match) === $n+1) {
$rest = array_pop($match);
}
$match = array(implode(PHP_EOL, $match));
if (isset($rest)) {
$match[] = $rest;
}
return $match;
}
$match = split_at_nth_line_break($str, 2);
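
To check that \R really covers the different line-break conventions, here is a small sketch. The wrapper name split_first_line() and the sample strings are assumptions for the example.

```php
<?php
// Hypothetical wrapper: splits a string at its first line break, whatever its style.
function split_first_line(string $str): array
{
    // \R matches \r\n as a single unit, as well as a lone \n or \r.
    return preg_split('/\R/', $str, 2);
}

// Sample inputs, assumed for the example only.
$samples = [
    'unix'    => "first\nsecond\nthird",
    'windows' => "first\r\nsecond\r\nthird",
    'old mac' => "first\rsecond\rthird",
];
foreach ($samples as $name => $str) {
    $parts = split_first_line($str);
    echo $name, ': ', $parts[0], PHP_EOL;
}
```

In all three cases the first element is "first" and the second element keeps the remaining line breaks untouched.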

Maybe you don't even need to use regular expressions. To just split lines, see:
What's the simplest way to return the first line of a multi-line string in Perl?

extracting multiple fields from a text file using php

What is the best way of extracting multiple values (~40) from a text file using php?
the data is more or less like:
NAMEA valuea
NAMEB valueb
I'm looking for a proper* approach to extracting this data into a data-structure, because i will need to specify regexs for all of them (all 40).
did i make myself clear?
*meaning, the default/painful method would be for me to do:
$namea = extractfunction("regexa", $textfilevalue);
$nameb = extractfunction("regexb", $textfilevalue);
... 40 times!
The lines may not be in the same order, or be present in each file. Every NAMEA is text like: "Registration Number:", or "Applicant Name:" (ie, with spaces in what i was calling as NAMEA)
Response to the Col.
I'm looking for a sensible "way" of writing my code, so it's readable, modifiable, builds an object/array that's easily callable, etc... "good coding style!" :)
@Adam - They do actually... and contain slashes as well...
@Alix - Freaking marvelous man! That was GOOD! Would you also happen to have any insights on how I can "truncate" the resultant array by removing everything from "key_x" and beyond? Should I open that as a new question?

Here is my take at it:
somefile.txt:
NAMEA valuea
NAMEB valueb
PHP Code:
$file = file_get_contents('./somefile.txt');
$string = preg_replace('~^(.+?)\s+(.+?)$~m', '$1=$2', $file);
$string = str_replace(array("\r\n", "\r", "\n"), '&', $string);
$result = array();
parse_str($string, $result);
echo '<pre>';
print_r($result);
echo '</pre>';
Output:
Array
(
[NAMEA] => valuea
[NAMEB] => valueb
)
You may also be able to further simplify this by using str_getcsv() on PHP 5.3+.
EDIT: My previous version fails for keys that contain spaces, as @Col. Shrapnel noticed. I didn't read the question carefully enough. Since your keys always seem to end with a colon (:), a possible solution is this:
$string = preg_replace('~^(.+?):\s+(.+?)$~m', '$1=$2', $file);
To remove everything from key_x to the end of the file you can do something like this (it assumes key_x is present; if strpos() returns false, substr() returns an empty string):
$string = substr($string, 0, strpos($string, 'key_x'));
So the whole thing would look like this:
somefile.txt:
Registration Number: valuea
Applicant Name: valueb
PHP Code:
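$file = file_get_contents('./somefile.txt');
$pos = strpos($file, 'key_x');
// Keep the whole file when key_x is absent; substr() with a false length would return an empty string
$string = ($pos === false) ? $file : substr($file, 0, $pos);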
$string = preg_replace('~^(.+?):\s+(.+?)$~m', '$1=$2', $string);
$string = str_replace(array("\r\n", "\r", "\n"), '&', $string);
$result = array();
parse_str($string, $result);
echo '<pre>';
print_r($result);
echo '</pre>';
Output:
Array
(
[Registration_Number] => valuea
[Applicant_Name] => valueb
)
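
Note that parse_str() URL-decodes its input and turns spaces in keys into underscores, so values containing &, =, + or % may not survive intact. As an alternative sketch (not the approach above), the array can be built directly from the lines. The function name parse_fields() and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper: parses "Key: value" lines into an associative array,
// keeping keys and values exactly as written (apart from surrounding whitespace).
function parse_fields(string $text): array
{
    $fields = [];
    // Split on any line-break style, then cut each line at the first colon.
    foreach (preg_split('/\R/', $text) as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && trim($parts[0]) !== '') {
            $fields[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $fields;
}

// Sample input, assumed for the example only.
$text = "Registration Number: 12/345\nApplicant Name: A & B Ltd\n";
print_r(parse_fields($text));
```

Lines without a colon are skipped, which also covers blank lines and fields that are missing from a given file.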

As far as I understand it, you can use file() to get an array of strings and then parse these strings with a regexp.
If you add an = sign between names and values, you'll be able to get the whole thing at once using parse_ini_file().

Assuming your keys (namea, nameb) never have spaces in them:
$contents = file('some_file.txt', FILE_IGNORE_NEW_LINES); // read file as array, without trailing newlines
$data = array();
foreach ($contents as $line) { // iterate over file
    // pull out key and value into $matches; skip blank or malformed lines
    if (preg_match('/^(\S+)\s+(.*)/', $line, $matches)) {
        $data[$matches[1]] = $matches[2]; // store key/value pairs in $data array
    }
}
var_dump($data); // what did we get?
