PHP regular expression to find a string in a file

I am trying to make a PHP script that outputs all values inside hyperef="**" from a text file.
I'm having a hard time with the regular expression part.
This is what I have
$Vdata = file_get_contents('file.txt');
preg_match( '/hyperef="(.*?)"/', $Vdata, $match );
echo '<pre>'; print_r($match); echo '</pre>';
My result is this :
Array
(
[0] => hyperef="http://lookbook.nu/look/5709720-Choies-Silvery-Bag-Rosewholesale-Punk-Style/hype"
[1] => http://lookbook.nu/look/5709720-Choies-Silvery-Bag-Rosewholesale-Punk-Style/hype
)
The [0] entry is not what I want: it includes the hyperef=" part I am searching for. All I want is the value after hyperef="
The second result, [1], is correct.
Also, my file should have given me about 10 results, not just 2.
Any ideas why? Thx

You can use preg_match_all to find all matches. The result still contains both the full matches and the captured hyperef values, but you can simply use only the latter.
if (preg_match_all('/hyperef="(.*?)"/i', $Vdata, $result)) {
$matches = $result[1]; //only values inside quotation marks
print_r($matches);
} else
print "nothing found";
I added the if so that the no-match case is handled, and the i modifier so that the pattern is case-insensitive.
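To make the shape of the result concrete, here is a minimal sketch run against an inline sample string instead of file.txt (the sample links are made up for illustration). With the default PREG_PATTERN_ORDER flag, $result[0] holds the full matches and $result[1] holds what the first capture group matched:

```php
<?php
// Hypothetical sample data standing in for the contents of file.txt.
$Vdata = 'a hyperef="http://example.com/one" b HYPEREF="http://example.com/two" c';

$values = [];
if (preg_match_all('/hyperef="(.*?)"/i', $Vdata, $result)) {
    // $result[0]: full matches, including the hyperef="..." wrapper
    // $result[1]: only the text captured by (.*?)
    $values = $result[1];
}
print_r($values);
```

preg_match stops after the first match, which is why the original code only ever returned one link.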

$pattern = "/\bo'reilly\b/i"; // only O'Reilly
$ora_books = preg_grep($pattern, file('filed.txt'));
var_dump($ora_books);
Example 2, reading the file line by line:
$fh = fopen('/path/to/your/file.txt', 'r') or die('Could not open file');
$ora_books = [];
while (($line = fgets($fh)) !== false) { // fgets() returns false at end of file
if (preg_match($pattern, $line)) { $ora_books[] = $line; }
}
fclose($fh);

Related

preg_match_all Output with Multiple Possibilities

I am searching a line using preg_match_all, but it is not known exactly what this line will look like. For example, it could look like this:
XXX012-013-015-######
Or it could look like this:
XXX012-013-015-XXX001-002-######
Where the 'X's are any letter and the '#'s are any number.
This is the relevant portion of the preg_match_all code that works exactly as expected if the line was always setup like the first example:
if (preg_match_all('#([A-Z]{3})((?:[0-9]{3}[->]{1}){1,32})([0-9]{2})([0-9]{2})([0-9]{2})...rest of code...#', $wwalist, $matches)) {
$wwaInfo['locationabbrev'][$wwanum] = $matches[2][$keys[$wwanum]];
}
The $matches[2] will display "012-013-015" as expected. Since the first part, xxx012-013-015, can repeat, I need for the preg_match_all $matches[2] to display the following if it is run on the second example:
012-013-015-001-002
This was my attempt, but it does not work:
if (preg_match_all('#([A-Z]{3})((?:[0-9]{3}[->]{1}){1,32})((?:[A-Z]{3}){0,1})(?:((?:[0-9]{3}[->]{1}){1,3}){0,3})([0-9]{2})([0-9]{2})([0-9]{2})...rest of code...#', $wwalist, $matches)) {
Hopefully this makes sense. Any help would be much appreciated! Thanks!
You aren't going to be able to match and join matches in the same step.
Will this work for you:
Code:
$strings=[
'ABC012-013-015-XYZ001-002-345435',
'ABC012-013-015-345453',
'XYZ013-014-015-016-EFG017-123456'
];
foreach($strings as $s){
if(preg_match('/[A-Z]{3}\d{3}/',$s)){ // check if string qualifies
echo 'Match found. Prepared string: ';
$s=preg_replace('/([A-Z]{3}|-\d{6})/','',$s); // remove unwanted substrings
echo "$s\n";
}
}
Output:
Match found. Prepared string: 012-013-015-001-002
Match found. Prepared string: 012-013-015
Match found. Prepared string: 013-014-015-016-017
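If you would rather collect the three-digit groups than delete everything around them, one alternative (a sketch, not the code from this answer) is to match the groups first and join them in a second step. The helper name join_groups is made up for illustration, and the pattern assumes every wanted group is exactly three digits followed by a hyphen:

```php
<?php
// Hypothetical helper: collect every 3-digit group that is followed by "-"
// and join the groups with hyphens.
function join_groups(string $s): string
{
    // (?<!\d) keeps the match from starting inside a longer digit run;
    // (?=-) requires a hyphen right after the group without consuming it.
    preg_match_all('/(?<!\d)\d{3}(?=-)/', $s, $m);
    return implode('-', $m[0]);
}

echo join_groups('ABC012-013-015-XYZ001-002-345435'), "\n";
echo join_groups('ABC012-013-015-345453'), "\n";
```

The trailing six-digit block is skipped because none of its three-digit slices is both preceded by a non-digit and followed by a hyphen.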
You could use a replace call and then output a new string with the matches. For example, with $str containing these lines:
ABC012-013-015-XYZ001-002-345435
ABC012-013-015-345453
XYZ013-014-015-016-EFG017-123456
$rep = preg_replace( '/(?mi-Us)([^0-9-\n]{3,})|-[0-9]{4,}/', '', $str) ;
echo ( $rep );
Should result in:
012-013-015-001-002
012-013-015
013-014-015-016-017
To output to an array:
$mat = preg_match_all( '/([0-9-]+)\n/', $rep, $res) ;
print_r( $res[1] ) ;
foreach( $res[1] as $result ) {
echo $result . "\n" ;
}
For the code you've shown you could probably do:
$rep = preg_replace( '/(?mi-Us)([^0-9-\n]{3,})|-[0-9]{4,}/', '', $wwalist ) ;
if (preg_match_all('/([0-9-]+)\n/', $rep, $matches)) {
$wwaInfo['locationabbrev'][$wwanum] = $matches[1][$keys[$wwanum]];
print_r( $wwaInfo['locationabbrev'][$wwanum] ); // comment out when done testing
}
Which should return the array:
Array
(
[0] => 012-013-015-001-002
[1] => 012-013-015
[2] => 013-014-015-016-017
)

Find lines that contain given word and return in array

Let's say I have the following lines:
How, are you!
Are you there?
Yes, you over there.
I have to display the lines that contain the word "there". In my case it should return the following array of strings:
["Are you there?", "Yes, you over there."]
What I did is :
$arr = array();
while (!feof($file)) {
$line = fgets($file);
if (strpos($line, $keyword) !== false) {
$arr[]=$line;
}
}
print_r($arr);
return;
How can I get the result in ["Are you there?", "Yes, you over there."] format?
Regex?
preg_match_all("/.* there .*|there .*|.* there.*/", $input_str, $output_array);
You can replace the "there" with a variable.
http://www.phpliveregex.com/p/fLn
Here's a quick way:
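$result = preg_grep("/$keyword/", file("/path/to/file.txt", FILE_IGNORE_NEW_LINES));
file() creates an array with one element per line, and preg_grep() keeps the elements that match the keyword. If the keyword can contain regex metacharacters, run it through preg_quote() first.
If you want that literal string, then that is JSON. preg_grep() preserves the original keys, so reindex with array_values() to get a JSON list rather than an object:
$string = json_encode(array_values($result));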
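Put together as something you can run, a minimal sketch might look like this. It writes the question's three lines to a temporary file; the \b word boundaries are an assumption about the intent (match "there" as a whole word, not inside "therefore"):

```php
<?php
// Write the sample lines from the question to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "How, are you!\nAre you there?\nYes, you over there.\n");

$keyword = 'there';
// preg_quote() escapes regex metacharacters in the keyword;
// \b restricts the match to whole words (an assumption about the intent).
$pattern = '/\b' . preg_quote($keyword, '/') . '\b/';

// FILE_IGNORE_NEW_LINES strips the trailing newline from each line.
$lines = file($path, FILE_IGNORE_NEW_LINES);
// preg_grep() keeps the original keys, so reindex before encoding.
$result = array_values(preg_grep($pattern, $lines));
unlink($path);

echo json_encode($result), "\n";
```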

Separate each word into an Array

I have a file with contents like :
Apple 100
banana 200
Cat 300
I want to search for a particular string in the file and get the next word. Eg: I search for cat, I get 300. I have looked up this solution: How to Find Next String After the Needle Using Strpos(), but that didn't help and I didn't get the expected output. I would be glad if you can suggest any method without using regex.
I'm not sure this is the best approach, but with the data you've provided, it'll work.
Get the contents of the file with fopen() and fread()
Separate the values into array elements with explode()
Iterate over your array and check whether each element's index is odd or even. Copy to a new array.
Not perfect, but on the right track.
<?php
$filename = 'data.txt'; // Let's assume this is the file you mentioned
$handle = fopen($filename, 'r');
$contents = fread($handle, filesize($filename));
fclose($handle);
$clean = trim(preg_replace('/\s+/', ' ', $contents)); // collapse newlines and repeated spaces
$flat_elems = explode(' ', $clean);
$multi = [];
$ii = count($flat_elems);
for ($i = 0; $i + 1 < $ii; $i += 2) {
$multi[$flat_elems[$i]] = $flat_elems[$i + 1]; // even index = word, odd index = value
}
print_r($multi);
This outputs an associative array like this:
Array
(
[Apple] => 100
[banana] => 200
[Cat] => 300
)
Try this. It doesn't use regex, but it will be inefficient if the string you're searching is long:
function get_next_word($string, $preceding_word)
{
// Turns the string into an array by splitting on spaces
$words_as_array = explode(' ', $string);
// Search the array of words for the word before the word we want to return
if (($position = array_search($preceding_word, $words_as_array)) !== FALSE)
return $words_as_array[$position + 1]; // Returns the next word
else
return false; // Could not find word
}
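Note that splitting a whole file on spaces leaves the newlines inside the words ("100\nbanana"), so a regex-free variant that works line by line may be easier to get right. The following is a sketch under assumptions: each line holds a word and a value separated by a single space, the helper name find_value is made up, and the comparison ignores case because the question searches for "cat" while the file contains "Cat":

```php
<?php
// Hypothetical helper: return the word that follows $needle on its line, or null.
function find_value(array $lines, string $needle): ?string
{
    foreach ($lines as $line) {
        // Split "Cat 300" into at most two parts on the first space.
        $parts = explode(' ', trim($line), 2);
        // strcasecmp() returns 0 when the strings are equal ignoring case.
        if (count($parts) === 2 && strcasecmp($parts[0], $needle) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// In practice $lines could come from file('data.txt').
$lines = ["Apple 100\n", "banana 200\n", "Cat 300\n"];
var_dump(find_value($lines, 'cat'));
```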
$find = 'Apple';
$content = file_get_contents('data.txt'); // assumed: the file from the question
preg_match_all('/' . preg_quote($find, '/') . '\s(\d+)/', $content, $matches);
print_r($matches);
You might benefit from using named regex subpatterns to capture the information you're looking for.
For example, to find a number (at most four digits, so up to 9999) that follows a given word:
/*String to search*/
$str = "cat 300";
/*String to find*/
$find = "cat";
/*Search for value*/
preg_match('/^' . preg_quote($find, '/') . '\s*(?P<value>[0-9]{1,4})$/', $str, $r);
/*Print results*/
print_r($r);
In cases where a match is found the results array will contain the number you're looking for indexed as 'value'.
This approach can be combined with
file_get_contents($file);
When matching against the whole file at once, add the m modifier so that ^ and $ match at the start and end of each line.

PHP preg_match get the last item

I have a PHP string with a list of items, and I would like to get the last one.
The reality is much more complex, but it boils down to:
$Line = 'First|Second|Third';
if ( preg_match( '#^.*|(?P<last>.+)$#', $Line, $Matches ) > 0 )
{
print_r($Matches);
}
I expect Matches['last'] to contain 'Third', but it does not work. Rather, I get Matches[0] to contain the full string and nothing else.
What am I doing wrong?
Please no workarounds, I can do it myself but I would really like to have this working with preg_match
You have this:
'#^.*|(?P<last>.+)$#'
     ^
... but I guess you are looking for a literal |:
'#^.*\|(?P<last>.+)$#'
     ^^
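Unescaped, | means alternation, so the original pattern reads "either ^.* or (?P<last>.+)$". The first branch already matches the whole line, which is why only Matches[0] was filled. A minimal sketch of the corrected pattern, using the sample string from the question:

```php
<?php
$Line = 'First|Second|Third';

// \| matches a literal pipe; the greedy .* backtracks to the last pipe in the
// string, so the named group captures only the final item.
if (preg_match('#^.*\|(?P<last>.+)$#', $Line, $Matches)) {
    echo $Matches['last'], "\n";
}
```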
Just use :
$Line = 'First|Second|Third' ;
$lastword = explode('|', $Line); // variable names are case-sensitive: $Line, not $line
echo $lastword[2];
If your syntax is always kinda the same, and by that I mean that uses the | as separator, you could do the following, if you like it.
$Line = 'First|Second|Third' ;
$line_array = explode('|', $Line);
$line_count = count($line_array) - 1;
echo $line_array[$line_count];
or
$Line = 'First|Second|Third' ;
$line_array = explode('|', $Line);
end($line_array);
echo $line_array[key($line_array)];
Example of PHP preg_match to get the last match:
<?php
$mystring = "stuff://sometext/2010-01-01/foobar/2016-12-12.csv";
preg_match_all('/\d{4}\-\d{2}\-\d{2}/', $mystring, $matches);
print_r($matches);
print("\nlast match: \n");
print_r($matches[0][count($matches[0])-1]);
print("\n");
?>
Prints the whole array returned and the last match:
Array
(
[0] => Array
(
[0] => 2010-01-01
[1] => 2016-12-12
)
)
last match:
2016-12-12

PHP - get line number of regex result

I'm trying to write some PHP that will read a CSS file, find all occurrences of the #group comment, and their line number. This is what I have so far, but it's returning the character count rather than the line number.
$file = 'master.css';
$string = file_get_contents($file);
$matches = array();
preg_match_all('/\/\* #group.*?\*\//m', $string, $matches, PREG_OFFSET_CAPTURE);
list($capture, $offset) = $matches[0];
$line_number = substr_count(substr($string, 0, $offset), "\n") + 1;
echo '<pre>';
print_r($matches[0]);
echo '</pre>';
Try using file() rather than file_get_contents(). The difference is that file() returns the file contents as an array, one element per line, rather than as a string like file_get_contents does. I should note that file() returns the newline character at the end of each line as part of the array element. If you don't want that, add the FILE_IGNORE_NEW_LINES flag as a second parameter.
From there, you can use preg_grep() to return only the elements of the initial array that match your pattern. preg_grep() preserves the original keys, so if you only want the line numbers you can read the indexes of the result; they are zero-based, so add 1.
An example:
myfile.txt:
hello world
how are you
say hello back!
line_find.php:
$filename = "myfile.txt";
$fileContents = file($filename);
$pattern = "/hello/";
$linesFound = preg_grep($pattern, $fileContents);
echo "<pre>", print_r($linesFound, true), "</pre>";
Result:
Array
(
[0] => hello world
[2] => say hello back!
)
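To turn those keys into line numbers, a short sketch could look like the following. The inline array stands in for file('myfile.txt', FILE_IGNORE_NEW_LINES) so the snippet runs without the file:

```php
<?php
// Stand-in for file('myfile.txt', FILE_IGNORE_NEW_LINES).
$fileContents = ['hello world', 'how are you', 'say hello back!'];

$linesFound = preg_grep('/hello/', $fileContents);

// Keys are zero-based positions in the original array, so add 1.
$lineNumbers = array_map(fn($index) => $index + 1, array_keys($linesFound));

foreach ($linesFound as $index => $line) {
    echo ($index + 1) . ": $line\n";
}
```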
Hope that helps.
This is not going to be optimal, but if you don't care about that :
$line_number = 1 + substr_count($string, "\n", 0, $index);
It just counts the newline characters found before $index, the byte offset that PREG_OFFSET_CAPTURE reports for the match.
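With PREG_OFFSET_CAPTURE, $matches[0] is a list of [text, offset] pairs, one per match, so a single list() on $matches[0] does not unpack one match. A minimal sketch that loops over the pairs and converts each offset into a line number, using made-up CSS in place of master.css:

```php
<?php
// Hypothetical stylesheet contents standing in for master.css.
$string = "/* #group header */\nh1 { margin: 0; }\n\n/* #group footer */\np { margin: 0; }\n";

preg_match_all('/\/\* #group.*?\*\//', $string, $matches, PREG_OFFSET_CAPTURE);

$line_numbers = [];
// Each entry of $matches[0] is a [matched text, byte offset] pair.
foreach ($matches[0] as [$text, $offset]) {
    // Newlines before the offset, plus one, gives the 1-based line number.
    $line_numbers[$text] = substr_count(substr($string, 0, $offset), "\n") + 1;
}
print_r($line_numbers);
```

This rescans the start of the string for every match, which is fine for a stylesheet but not ideal for very large inputs.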
