Regex Match Exact Number at beginning (like 99 but not 999) - php

This should be a simple task, but searching for it all day I still can't figure out what I'm missing
I'm trying to open a file using PHP's glob() that begins with a specific number
Example filenames in a directory:
1.txt
123.txt
10 some text.txt
100 Some Other Text.txt
The filenames always begin with a unique number (which is what I need to use to find the right file), optionally followed by a space and some text, and finally the .txt extension.
My problem is that no matter what I do, if I try to match the number 1 in the example folder above, it matches every file that begins with 1. I need to open only the file that starts with exactly 1, no matter what follows it, whether that is a space and text or just .txt.
Some example regex that does not succeed at the task:
filepath/($number)*.txt
filepath/([($number)])( |*.?)*.txt
filepath/($number)( |*.?)*.txt
I'm sure there's a very simple solution to this... If possible I'd like to avoid loading every single file into a PHP array and using PHP to check every item for the one that begins with only the exact number, when surely regex can do it in a single action
A bonus would be if you also know how to turn the optional text between the number and the extension into a variable, but that is entirely optional as it's my next task after I figure this one out

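The regex you want to use is: ^99(\D*\.txt)$
$re = "/^99(\D*\.txt)$/";
preg_match($re, $str, $matches);
This will match:
99.txt
99files.txt
but not:
199.txt
999.txt
99
99.txt.xml
99filesoftxt.dat
The ( ) around the \D*\.txt create a capturing group, so $matches[1] will contain the rest of the file name after the number. The quantifier has to be * rather than +, otherwise a plain 99.txt would not match. Note that \D also rejects digits in the optional text, so 99 part2.txt would not match.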
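For the bonus part of the question (getting the optional text into a variable), the same idea can be wrapped in a small helper. This is only a sketch: the name parseNumberedFilename is made up for illustration, and it assumes the optional text may itself contain digits as long as the character right after the number is not a digit.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text between
// the number and ".txt", or null when the name does not start with exactly $number.
function parseNumberedFilename(string $filename, int $number): ?string
{
    // (?!\d) stops 1 from matching 10 or 123; (.*) captures whatever follows.
    $pattern = '/^' . $number . '(?!\d)(.*)\.txt$/';
    if (preg_match($pattern, $filename, $m) === 1) {
        return trim($m[1]);
    }
    return null;
}

// Try it against the example directory listing from the question.
foreach (['1.txt', '123.txt', '10 some text.txt', '100 Some Other Text.txt'] as $name) {
    var_dump(parseNumberedFilename($name, 10));
}
```

A file with no extra text, such as 1.txt, comes back as an empty string, which keeps it distinguishable from the null returned for a non-matching name.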

I believe this is what you want OP:
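$regex = '/^' . $number . '[^0-9][\S\s]+/';
The ^ anchors the match at the start of the name (without it, 11.txt would still match on its second 1). The pattern then matches the number, then any character that isn't a digit, then one or more other characters. If the number is 1, this would match: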
1.txt
1abc.txt
1 abc.txt
1_abc.txt
1qrx.txt
But it would not match:
1
12.txt
2.txt
11.txt
1.

Here you go:
<?php
function findFileWithNumericPrefix($filepath, $prefix)
{
    if (($dir = opendir($filepath)) === false) {
        return false;
    }
    while (($filename = readdir($dir)) !== false) {
        // the prefix must be followed by a non-digit character
        if (preg_match("/^$prefix\D/", $filename) === 1) {
            closedir($dir);
            return $filename;
        }
    }
    closedir($dir);
    return false;
}
$file = findFileWithNumericPrefix('/base/file/path', 1);
if ($file !== false) {
    echo "Found file: $file";
}
?>
With your example directory listing, the result is:
Found file: 1.txt
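Since the question asks about glob() specifically, one possible middle ground is to let glob() narrow down the candidates and then apply the same \D check to the few names it returns. This is a rough sketch, not a tested production routine: the function name globExactNumber is hypothetical, and it assumes the directory path contains no glob metacharacters.

```php
<?php
// Hypothetical helper: glob() narrows the candidates, the regex makes the prefix exact.
function globExactNumber(string $dir, int $number): array
{
    // For $number = 1 this glob pattern alone also returns 10 ....txt and 123.txt.
    $candidates = glob($dir . '/' . $number . '*.txt') ?: [];

    // Keep only the names where the number is followed by a non-digit.
    $exact = array_filter($candidates, function (string $path) use ($number): bool {
        return preg_match('/^' . $number . '\D/', basename($path)) === 1;
    });

    return array_values($exact);
}

// Prints an empty array if the directory does not exist.
print_r(globExactNumber('/base/file/path', 1));
```

Only the files whose names start with the requested digits are ever handed to PHP, so the array being filtered stays small even in a large directory.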

You can use a regex like this (replace 10 with the number you want):
^10\D.*txt$
For instance:
$re = "/^10\\D.*txt$/m";
$str = "1.txt\n123.txt\n10 some text2.txt\n100 Some Other2 Text.txt";
preg_match_all($re, $str, $matches);
// will match only "10 some text2.txt"

PHP to search a word within txt file and echo the whole line [duplicate]

This might look like a duplicate, but it's a different issue. I'll almost copy/paste another question, but I'm asking about a different problem. Since that thread's owner described it clearly, I will describe it the same way.
I have a normal text file with each line holding data in the following format:
Username | Age | Street
What I want to do is search for the Username in the file and, when it is found, print the whole line. The question below does this perfectly, with one main problem:
PHP to search within txt file and echo the whole line
Issue: If you have the name "Tobias" and search for "Tobi", it will find and display "Tobias", but I only want to match the search string as a whole word. If I search for "Tobi" it should only find "Tobi" and not "Tobias" or any other string containing "Tobi".
It works using this solution: https://stackoverflow.com/a/4366744/14071499
But that solution has its own issue: it only prints the string I am searching for, not the whole line.
So how can I search for a word and print the whole line, without also finding strings that merely contain the word?
The Code I have so far:
<?php
$file = 'ids.txt';
$searchfor = $_POST['search'];
// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');
// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/\b{$pattern}.*\$/m";
// search, and store all matching occurences in $matches
if(preg_match_all($pattern, $contents, $matches)){
echo "Found matches:\n";
echo implode("\n", $matches[0]);
}
else{
echo "No matches found";
}
?>
This answer doesn't take into account fields in your source data, since at the moment you're just bulk-matching the raw text and interested in getting full lines. There is a much simpler way to accomplish this, ie. by using file that loads each line into an array member, and the application of preg_grep that filters an array with a regular expression. Implemented as follows:
$lines = file('ids.txt', FILE_IGNORE_NEW_LINES|FILE_SKIP_EMPTY_LINES); // lines as array
$search = preg_quote($_POST['search'], '~');
$matches = preg_grep('~\b' . $search . '\b~', $lines);
foreach($matches as $line => $match) {
echo "Line {$line}: {$match}\n";
}
In related notes, to match only complete words, instead of substrings, you need to have word boundaries \b on both sides of the pattern. The loop above outputs both the match and the line number (0-indexed), since array index keys are saved when using preg_grep.
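To make the effect of the second \b concrete, here is a small self-contained comparison. The sample lines are invented for illustration (they only follow the Username | Age | Street layout from the question), and the last pattern, which restricts the match to the first field, is an extra assumption about what the asker may ultimately want.

```php
<?php
// Invented sample lines in the "Username | Age | Street" format.
$lines = [
    'Tobias | 31 | Main Street',
    'Tobi | 25 | Oak Avenue',
    'Ann | 40 | Tobi Lane',
];

$search = preg_quote('Tobi', '~');

// Without the trailing \b, "Tobias" is matched as well.
$loose = preg_grep('~\b' . $search . '~', $lines);

// With \b on both sides, only whole-word occurrences remain.
$whole = preg_grep('~\b' . $search . '\b~', $lines);

// Anchoring to the start of the line limits the match to the Username field.
$userOnly = preg_grep('~^' . $search . '\s*\|~', $lines);

print_r($whole);
print_r($userOnly);
```

Note that the whole-word version still returns the "Tobi Lane" line, because the word appears in the Street field; only the anchored version ignores it.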
<?php
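$file = "ids.txt";
// escape regex metacharacters in the user-supplied search term
$search = preg_quote($_POST["search"], "/");
header("Content-Type: text/plain");
$contents = file_get_contents($file);
$lines = explode("\n", $contents);
foreach ($lines as $line) {
    // \b on both sides matches the term only as a whole word
    if (preg_match("/\b{$search}\b/", $line)) {
        echo $line, "\n";
    }
}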

php - regex exact matches

I have the following strings:
Falchion-Case
P90-ASH-WOOD-WELL-WORN
I also have the following URLS which are inside a text file:
http://csgo.steamanalyst.com/id/115714004/FALCHION-CASE-KEY-
http://csgo.steamanalyst.com/id/115716486/FALCHION-CASE-
http://csgo.steamanalyst.com/id/2018/P90-ASH-WOOD-WELL-WORN
I'm looping through each line in the text file and checking if the string is found inside the URL:
// Read from file
if (stristr($item, "stattrak") === FALSE) {
$lines = file(public_path().'/csgoanalyst.txt');
foreach ($lines as $line) {
// Check if the line contains the string we're looking for, and print if it does
if(preg_match('/(?<!\-)('.$item.')\b/',$line) != false) { // case insensitive
echo $line;
break;
}
}
}
This works perfectly when $item = P90-ASH-WOOD-WELL-WORN. However, when $item = Falchion-Case it matches both URLs, when only the second, http://csgo.steamanalyst.com/id/115716486/FALCHION-CASE-, is valid.
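Try modifying your regex to match at the end of the line, assuming the item is the last thing on the line:
'/('.$item.')$/'
This is basically an "ends with" type of match. Because the URLs here end with a hyphen, you can make the trailing hyphen optional, and add the i flag since the URLs are upper case:
'/('.$item.')\-?$/i'
For Falchion-Case this would match
http://csgo.steamanalyst.com/id/115716486/FALCHION-CASE- <<end of line
but not the FALCHION-CASE-KEY- URL, because there the item is not at the end of the line.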
You can also use a negative lookahead to negate that unwanted case:
preg_match('/(?<!-)'. preg_quote($item) .'(?!-\w)/i', $line);
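As a quick check of the lookaround version, the following sketch runs it against the three URLs from the question. The helper name findItemUrl is made up for illustration; the only change to the pattern is passing the delimiter to preg_quote().

```php
<?php
// The three URLs from the question.
$lines = [
    'http://csgo.steamanalyst.com/id/115714004/FALCHION-CASE-KEY-',
    'http://csgo.steamanalyst.com/id/115716486/FALCHION-CASE-',
    'http://csgo.steamanalyst.com/id/2018/P90-ASH-WOOD-WELL-WORN',
];

// Hypothetical helper: returns the first line that contains $item as a complete name.
function findItemUrl(string $item, array $lines): ?string
{
    // (?<!-) : not preceded by a hyphen; (?!-\w) : not followed by "-" plus a word character.
    $pattern = '/(?<!-)' . preg_quote($item, '/') . '(?!-\w)/i';
    foreach ($lines as $line) {
        if (preg_match($pattern, $line) === 1) {
            return $line;
        }
    }
    return null;
}

echo findItemUrl('Falchion-Case', $lines), "\n";
echo findItemUrl('P90-ASH-WOOD-WELL-WORN', $lines), "\n";
```

The FALCHION-CASE-KEY- line is rejected because the item is followed by -K, while the trailing hyphen of FALCHION-CASE- is allowed since no word character comes after it.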

Regex to match numbers separated by dash (-) and get substring

I have a lot of image files in a directory with their ID between some descriptions about what it contains.
This is an example of the files in that directory:
de-te-mo-01-19-1084 moldura.JPG, ce-ld-ns-02-40-0453 senal.JPG, dp-bs-gu-01-43-1597-guante.JPG, am-ca-tw-04-30-2436 Tweter.JPG, am-ma-ac-02-26-0745 aceite.JPG, ca-cc-01-43-1427-F.jpg
What I want is to get the ID of the image *(nn-nn-nnnn) and rename the file with that sub-string.
*n as a number.
The result from the list above would be like: 01-19-1084.JPG, 02-40-0453.JPG, 01-43-1597.JPG, 04-30-2436.JPG, 02-26-0745.JPG, 01-43-1427.jpg.
This is the code I'm using to loop the directory:
$dir = "images";
// Open a known directory, and proceed to read its contents
if (is_dir($dir)) {
if ($dh = opendir($dir)) {
while (($file = readdir($dh)) !== false) {
if($sub_str = preg_match($patern, $file))
{
rename($dir.'/'.$file, $sub_str.'JPG');
}
}
closedir($dh);
}
}
So, what should my $patern be to get what I want?
$pattern must be like this:
$pattern = "/^.*(\d{2}-\d{2}-\d{4}).*\.jpg$/i";
This pattern checks the file name and captures the ID as a match group. Also, preg_match returns a number (1 on a match, 0 otherwise), not a string. The matches are returned through the third parameter of the function. The body of the while loop must look like this:
if (preg_match($pattern, $file, $matches))
{
    rename($dir.'/'.$file, $matches[1].'.JPG');
}
$matches is an array holding the whole matched string and the groups.
Wouldn't that be just like:
^.*([0-9]{2})-([0-9]{2})-([0-9]{4}).*\.jpg$
Explanation:
^            Start of string
.*           Any characters, zero or more times
([0-9]{2})   2 digits
-            A literal - character
([0-9]{2})   2 digits
-            A literal - character
([0-9]{4})   4 digits
.*           Any characters, zero or more times
\.jpg        The extension, with the dot escaped so it is not a wildcard
$            End of string
Now you have 3 groups inside the (). You have to use indexes 1, 2 and 3. Add the i flag if the extension can also be upper case (.JPG).
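Putting the pieces together, the name computation can be separated from the rename() call so it is easy to check on its own. This is a sketch under a few assumptions: the function name idFilename is hypothetical, the ID is captured as one group rather than three, and the original extension (including its case) is kept, which is what the expected output in the question shows.

```php
<?php
// Hypothetical helper: builds the new name from the nn-nn-nnnn ID and keeps the
// original extension; returns null when the filename contains no such ID.
function idFilename(string $file): ?string
{
    if (preg_match('/(\d{2}-\d{2}-\d{4}).*\.(jpg)$/i', $file, $m) !== 1) {
        return null;
    }
    return $m[1] . '.' . $m[2];
}

// File names taken from the question.
$files = [
    'de-te-mo-01-19-1084 moldura.JPG',
    'dp-bs-gu-01-43-1597-guante.JPG',
    'ca-cc-01-43-1427-F.jpg',
];
foreach ($files as $file) {
    echo $file, ' => ', idFilename($file), "\n";
}
```

Inside the directory loop, the result could then be used as rename($dir.'/'.$file, $dir.'/'.$new), assuming the renamed files should stay in the same directory.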

In PHP Remove several characters from the beginning of a String?

I need to find a specific line of text, from a text-file,
and then copy it to a new text-file:
1: I have a text file with several lines of text, eg:
JOHN
MIKE
BEN
*BJAMES
PETE
2: So, I read that text-file's contents into an array,
with each line of text placed into a separate element of the array.
3: Then I tested each element of the array,
to find the line that starts with, say: *B ie:
if ( preg_match( "/^\*(B)/",$contents[$a] ) )
Which works ok...
4: Then I copy (WRITE) that line of text, to a new text-file.
Q: So how can I remove the '*B' from that line of text,
BEFORE I WRITE it to the new text-file ?
If you already use preg_match, you can modify your regex to capture what you want in another variable.
if (preg_match('/^\*B(.*)$/', $contents[$a], $matches))
{
    fwrite($targetPointer, $matches[1]);
}
After calling preg_match, the variable $matches holds the parts of the subject matched by the parenthesised subpatterns of the regex. So the relevant part of your line is matched by (.*) and saved into $matches[1].
This approach writes the lines as the file is read, which is more memory efficient:
$sourceFile = new SplFileObject('source.txt');
$destinationFile = new SplFileObject('destination.txt', 'w+');
foreach (new RegexIterator($sourceFile, '/^\*B.*/') as $filteredLine) {
$destinationFile->fwrite(
substr_replace($filteredLine, '', 0, 2)
);
}
With substr or preg_replace.
Have a try with:
// preg_replace returns the modified string; it does not change $contents[$a] in place
$line = preg_replace('/^\*B/', '', $contents[$a], -1, $count);
if ($count) {
    fwrite($file, $line);
}
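For reference, here is the preg_replace idea run over the sample names from the question, collecting the stripped lines in an array instead of writing them to a file. The array name $kept is just for illustration.

```php
<?php
// Sample data from the question.
$contents = ['JOHN', 'MIKE', 'BEN', '*BJAMES', 'PETE'];

$kept = [];
foreach ($contents as $line) {
    // preg_replace() returns the new string; $count reports how many replacements were made.
    $stripped = preg_replace('/^\*B/', '', $line, -1, $count);
    if ($count > 0) {
        $kept[] = $stripped;
    }
}

print_r($kept);
```

The key point is that the return value has to be used: the original string is left untouched, and $count only tells you whether the prefix was present.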

php simplest case regex replacement, but backtraces not working

Hacking up what I thought was the second simplest type of regex (extract a matching string from some strings, and use it) in php, but regex grouping seems to be tripping me up.
Objective
take a ls of files, output the commands to format/copy the files to have the correct naming format.
Resize copies of the files to create thumbnails. (not even dealing with that step yet)
Failure
My code fails at the regex step, because although I just want to filter out everything except a single regex group, when I get the results, it's always returning the group that I want -and- the group before it, even though I in no way requested the first backtrace group.
Here is a fully functioning, runnable version of the code on the online ide:
http://ideone.com/2RiqN
And here is the code (with a cut down initial dataset, although I don't expect that to matter at all):
<?php
// Long list of image names.
$file_data = <<<HEREDOC
07184_A.jpg
Adrian-Chelsea-C08752_A.jpg
Air-Adams-Cap-Toe-Oxford-C09167_A.jpg
Air-Adams-Split-Toe-Oxford-C09161_A.jpg
Air-Adams-Venetian-C09165_A.jpg
Air-Aiden-Casual-Camp-Moc-C09347_A.jpg
C05820_A.jpg
C06588_A.jpg
Air-Aiden-Classic-Bit-C09007_A.jpg
Work-Moc-Toe-Boot-C09095_A.jpg
HEREDOC;
if($file_data){
$files = preg_split("/[\s,]+/", $file_data);
// Split up the files based on the newlines.
}
$rename_candidates = array();
$i = 0;
foreach($files as $file){
$string = $file;
$pattern = '#(\w)(\d+)_A\.jpg$#i';
// Use the second regex group for the results.
$replacement = '$2';
// This should return only group 2 (any number of digits), but instead group 1 is somehow always in there.
$new_file_part = preg_replace($pattern, $replacement, $string);
// Example good end result: <img src="images/ch/ch-07184fs.jpg" width="350" border="0">
// Save the rename results for further processing later.
$rename_candidates[$i]=array('file'=>$file, 'new_file'=>$new_file_part);
// Rename the images into a standard format.
echo "cp ".$file." ./ch/ch-".$new_file_part."fs.jpg;";
// Echo out some commands for later.
echo "<br>";
$i++;
if($i>10){break;} // Just deal with the first 10 for now.
}
?>
Intended result for the regex: 788750
Intended result for the code output (multiple lines of): cp air-something-something-C485850_A.jpg ./ch/ch-485850.jpg;
What's wrong with my regex? Suggestions for simpler matching code would be appreciated as well.
Just a guess:
$pattern = '#^.*?(\w)(\d+)_A\.jpg$#i';
This includes the whole filename in the match. Otherwise preg_replace() will really only substitute the end of each string - it only applies the $replacement expression on the part that was actually matched.
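An alternative that sidesteps the replacement problem is to use preg_match() and read the capture group directly, so the unmatched start of the filename never ends up in the result. This is only a sketch: the helper name extractStyleNumber is made up, and it assumes the wanted part is simply the run of digits right before _A.jpg (so a leading zero, as in 07184, is kept).

```php
<?php
// Hypothetical helper: returns the digits right before "_A.jpg", or null if absent.
function extractStyleNumber(string $file): ?string
{
    // Only the digits are captured, so a prefix letter such as "C" is skipped.
    if (preg_match('#(\d+)_A\.jpg$#i', $file, $m) === 1) {
        return $m[1];
    }
    return null;
}

// A few names from the question's data set.
$files = ['07184_A.jpg', 'Adrian-Chelsea-C08752_A.jpg', 'C05820_A.jpg'];
foreach ($files as $file) {
    $number = extractStyleNumber($file);
    if ($number !== null) {
        echo "cp {$file} ./ch/ch-{$number}fs.jpg;\n";
    }
}
```

Files that do not fit the naming scheme return null and are skipped rather than producing a malformed cp command.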
Scan Dir and Explode
You know what? A simpler way to do it in php is to use scandir and explode combo
$dir = scandir('/path/to/directory');
foreach($dir as $file)
{
    $ext = pathinfo($file,PATHINFO_EXTENSION);
    if($ext!='jpg') continue;
    $a = explode('-',$file); //grab the end of the string after the -
    $newfilename = end($a); //if there is no dash just take the whole string
    $newlocation = './ch/ch-'.str_replace(array('C','_A'),'', basename($newfilename,'.jpg')).'fs.jpg';
    echo "#copy($file, $newlocation)\n";
}
#and you are done :)
explode: basically a filename like blah-2.jpg is turned into array('blah', '2.jpg'), and taking the end() of that gets the last element. It's almost the same as array_pop(), except that end() does not remove the element from the array.
Working Example
Here's my ideone code http://ideone.com/gLSxA
