I am currently trying to add tokens to a CMS using PHP.
The user can enter (into a WYSIWYG Editor) a string such as [my_include.php]. We would like to extract anything with this format, and turn it into an include of the following format:
include('my_include.php');
Can anyone assist with composing the RegExp and extraction process to allow this? Ideally, I would like to extract them all into a single array, so that we can provide some checking before parsing it as the include();?
Thanks!
preg_replace('~\[([^\]]+)\]~', 'include "\\1";', $str);
Working sample: http://ideone.com/zkwX7
You'll want to go with either preg_match_all(), looping over the results and replacing whatever you found, or a callback. The preg_match_all() route might be a bit faster than the following callback solution, but it is a bit trickier if PREG_OFFSET_CAPTURE and substr_replace() are used.
<?php
function handle_replace_thingie($matches) {
// build a file path
$file = '/path/to/' . trim($matches[1]);
// do some sanity checks, like file_exists, file-location (not that someone includes /etc/passwd or something)
// check realpath(), file_exists()
// limit the readable files to certain directories
if (false) {
return $matches[0]; // return original, no replacement
}
// assuming the include file outputs its stuff we need to capture it with an output buffer
ob_start();
// execute the include
include $file;
// grab the buffer's contents
$res = ob_get_contents();
ob_end_clean();
// return the contents to replace the original [foo.php]
return $res;
}
$string = "hello world, [my_include.php] and [foo-bar.php] should be replaced";
$string = preg_replace_callback('#\[([^\]]+)\]#', 'handle_replace_thingie', $string);
echo $string, "\n";
?>
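For comparison, the preg_match_all() route mentioned above might look roughly like the sketch below. It is only an illustration: the function name replace_tokens() and the $replacements map (token name => already rendered output) are assumptions, not part of the original answer. Matches are replaced from last to first so that the byte offsets reported by PREG_OFFSET_CAPTURE stay valid.

```php
<?php
// Hypothetical helper: replaces [name] tokens using a map of name => replacement text.
function replace_tokens(string $str, array $replacements): string
{
    // With PREG_OFFSET_CAPTURE each match is an array of [text, byte offset].
    if (!preg_match_all('~\[([^\]]+)\]~', $str, $m, PREG_OFFSET_CAPTURE)) {
        return $str;
    }
    // Work backwards so earlier offsets are not shifted by later replacements.
    for ($i = count($m[0]) - 1; $i >= 0; $i--) {
        list($token, $offset) = $m[0][$i];
        $name = trim($m[1][$i][0]);
        if (!array_key_exists($name, $replacements)) {
            continue; // unknown token: leave the original text untouched
        }
        $str = substr_replace($str, $replacements[$name], $offset, strlen($token));
    }
    return $str;
}

echo replace_tokens(
    'hello world, [my_include.php] and [foo-bar.php]',
    array('my_include.php' => '<p>included</p>')
), "\n";
```

Because all tokens end up in one array first, any checking can happen before a single replacement is made.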
Using preg_match_all(), you could do this:
$matches = array();
// If we've found any matches, do stuff with them
if (preg_match_all('/\[([^\[\]]+\.php)\]/i', $input, $matches))
{
    // $matches[1] holds the captured file names, without the brackets
    foreach ($matches[1] as $match)
    {
        // Any validation code goes here
        include_once("/path/to/" . $match);
    }
}
The regex used here is \[([^\[\]]+\.php)\]. It only matches bracketed names ending in .php, so if the user types [hello], for example, it won't match. The capture group keeps the brackets out of the file name.
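The "validation code" placeholder above could be filled with a directory check along these lines. This is a minimal sketch, not a complete security measure: resolve_include() and the includes directory are hypothetical names, and it assumes that all permitted files live under one base directory.

```php
<?php
// Hypothetical helper: returns the resolved path if $name is a file inside
// $baseDir, otherwise null.
function resolve_include(string $baseDir, string $name): ?string
{
    $base = realpath($baseDir);
    $path = realpath($baseDir . DIRECTORY_SEPARATOR . $name);
    // realpath() returns false for paths that do not exist.
    if ($base === false || $path === false || !is_file($path)) {
        return null;
    }
    // Reject anything that resolved to a location outside the base directory,
    // e.g. via "../" sequences or symlinks.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($path, $prefix, strlen($prefix)) === 0 ? $path : null;
}

// The directory name "includes" is an assumption for this example.
$path = resolve_include(__DIR__ . '/includes', 'my_include.php');
if ($path !== null) {
    include_once $path;
}
```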
This might look like a duplicate, but it's a different issue. I'll almost copy/paste another question, but I'm asking about a different problem. Also, since that thread's owner described it very clearly, I will describe it the way he did.
I have a normal text files with each line having data in the following format.
Username | Age | Street
Now what I wanted to do was to search for the Username in the file and, when found, print the whole line. The question below does this perfectly, with one main problem:
PHP to search within txt file and echo the whole line
Issue: If you have the name "Tobias" and search for "Tobi", it will find it and display "Tobias", but I only want to match the whole word used as the search string. If I search for "Tobi", it should only find "Tobi" and not "Tobias" or any other string containing "Tobi".
It works using this solution: https://stackoverflow.com/a/4366744/14071499
But that solution has its own issue: it only prints the string that I am searching for and doesn't print the whole line.
So how can I search for a word and print the whole line afterwards, without also finding other strings that merely contain the word?
The Code I have so far:
<?php
$file = 'ids.txt';
$searchfor = $_POST['search'];
// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');
// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/\b{$pattern}.*\$/m";
// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
echo "Found matches:\n";
echo implode("\n", $matches[0]);
}
else{
echo "No matches found";
}
?>
This answer doesn't take the fields in your source data into account, since at the moment you're just bulk-matching the raw text and are interested in getting full lines. There is a much simpler way to accomplish this: use file(), which loads each line into an array member, and preg_grep(), which filters an array with a regular expression. Implemented as follows:
$lines = file('ids.txt', FILE_IGNORE_NEW_LINES|FILE_SKIP_EMPTY_LINES); // lines as array
$search = preg_quote($_POST['search'], '~');
$matches = preg_grep('~\b' . $search . '\b~', $lines);
foreach($matches as $line => $match) {
echo "Line {$line}: {$match}\n";
}
On a related note, to match only complete words instead of substrings, you need word boundaries \b on both sides of the pattern. The loop above outputs both the match and the line number (0-indexed), since array index keys are preserved when using preg_grep.
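If the search should apply to the Username field only, rather than to any word on the line, a field-aware variant could look like the sketch below. The sample lines and the function name find_by_username() are made up for illustration; the only assumption about the data is the "Username | Age | Street" layout described in the question.

```php
<?php
// Hypothetical helper: returns the lines whose first field equals $username exactly.
function find_by_username(array $lines, string $username): array
{
    $hits = array_filter($lines, function (string $line) use ($username): bool {
        // Split "Username | Age | Street" into trimmed fields.
        $fields = array_map('trim', explode('|', $line));
        return $fields[0] === $username;
    });
    return array_values($hits);
}

// Sample data invented for this example.
$lines = array(
    'Tobi | 31 | Main Street',
    'Tobias | 25 | Oak Avenue',
    'Anna | 40 | Tobi Lane',
);
print_r(find_by_username($lines, 'Tobi'));
```

An exact comparison on the first field avoids both the "Tobias" problem and accidental hits in the street name, and needs no regular expression at all.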
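<?php
$file = "ids.txt";
// escape regex metacharacters in the user-supplied search term
$search = preg_quote($_POST["search"], "/");
header("Content-Type: text/plain");
$contents = file_get_contents($file);
$lines = explode("\n", $contents);
foreach ($lines as $line) {
    // \b on both sides restricts the match to the whole word
    if (preg_match("/\b{$search}\b/", $line)) {
        echo $line, "\n";
    }
}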
I am trying to echo out the names/paths of the files that are written in logfile.txt. For that, I use a regex to match everything before the first occurrence of : and output it. I am reading logfile.txt line by line:
<?php
$logfile = fopen("logfile.txt", "r");
if ($logfile) {
while (($line = fgets($logfile)) !== false) {
if (preg_match_all("/[^:]*/", $line, $matched)) {
foreach ($matched as $val) {
foreach ($val as $read) {
echo '<pre>'. $read . '</pre>';
}
}
}
}
fclose($logfile);
} else {
die("Unable to open file.");
}
?>
However, I get the entire contents of the file instead. The desired output would be:
/home/user/public_html/an-ordinary-shell.php
/home/user/public_html/content/execution-after-redirect.html
/home/user/public_html/paypal-gateway.html
Here is the content of logfile.txt:
-------------------------------------------------------------------------------
/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND
/home/user/public_html/content/execution-after-redirect.html: {LDB}VT-malware33.UNOFFICIAL FOUND
/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073
Extra question: How do I skip reading the first two lines (namely the dashes and the empty line)?
Here you go:
<?php
# to read the real file as a string, use this instead:
# $data = file_get_contents("logfile.txt");
# data for this specific purpose
$data = <<< DATA
-------------------------------------------------------------------------------
/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND
/home/user/public_html/content/execution-after-redirect.html: {LDB}VT-malware33.UNOFFICIAL FOUND
/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073
DATA;
$regex = '~^(/[^:]+):~m';
# ^ - anchor it to the beginning
# / - a slash
# ([^:]+) capture at least anything NOT a colon
# turn on multiline mode with m
preg_match_all($regex, $data, $files);
print_r($files);
?>
It even skips both your lines, see a demo on ideone.com.
preg_match_all returns all occurrences of the pattern. Since [^:]* can also match an empty string, for the first line it will return:
/home/user/public_html/an-ordinary-shell.php, an empty string (at the colon), " Php.Trojan.PCT4-1 FOUND" and another empty string (at the end of the line),
that is, every substring that doesn't contain a colon.
To obtain a single result, use preg_match instead, although explode should suffice here.
To skip lines you don't want, you can for example build a generator function that gives only the good lines. You can also use a stream filter.
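A generator combined with explode could be sketched as follows. The function name scan_result_paths() is hypothetical, and the rule "a good line starts with a slash" is an assumption based on the sample log shown in the question.

```php
<?php
// Hypothetical generator: yields the path portion of each "path: result" line
// and skips everything else (separator lines, empty lines).
function scan_result_paths(iterable $lines): Generator
{
    foreach ($lines as $line) {
        $line = trim($line);
        // Assumption: only lines that start with "/" carry a file path.
        if ($line === '' || $line[0] !== '/') {
            continue;
        }
        // Everything before the first colon is the path.
        yield explode(':', $line, 2)[0];
    }
}

$sample = array(
    "-------------------------------------------------------------------------------\n",
    "\n",
    "/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND\n",
    "/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073\n",
);
foreach (scan_result_paths($sample) as $path) {
    echo $path, "\n";
}
```

The same generator accepts the array returned by file("logfile.txt"), so the log does not have to be embedded in the script.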
I'm trying to create a system where a user can type a phrase such as '{item(5)}' into a rich text editor; then, when the code renders the content on a page in the front end, the '{item(5)}' is replaced with a snippet of code / function that uses the 5 as a unique identifier.
I guess it's similar to how a WordPress widget would work.
I'm not too familiar with the preg_ functions, but I have managed to pull out the {item(5)} and replace it with a function. However, the problem is that it removes the rest of the content.
I might not be on the right lines, but here is the code so far. Any help would be most appreciated.
$string ='This is my body of text, you should all check out this item {item(7)} or even this item {item(21)} they are great...';
if(preg_match_all('#{item((?:.*?))}#is', $string, $output, PREG_PATTERN_ORDER))
$matches = $output[0];
foreach($matches as $match){
item_widget(preg_replace("/[^0-9]/", '', $match));
}
The item_widget is just a function that uses the number to bring out a html chunk
You probably want a preg_replace_callback instead:
$output = preg_replace_callback('/\{item\((\d+)\)\}/', function($match) {
// item_widget should *return* its result for you to insert into your stream
return item_widget($match[1]);
}, $string);
This will replace the {item(n)} markers with the relevant widget results, assuming - as mentioned in the code comment - that it actually returns its code.
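To see the effect end to end, here is a self-contained sketch with a stand-in item_widget(). The stub's HTML is invented for the example; the real function would return whatever chunk belongs to the given id.

```php
<?php
// Stand-in for the real widget function; it must *return* its markup, not echo it.
function item_widget(int $id): string
{
    return '<div class="item" data-id="' . $id . '"></div>';
}

$string = 'Check out this item {item(7)} or even this item {item(21)}, they are great...';

$output = preg_replace_callback(
    '/\{item\((\d+)\)\}/',
    function (array $match): string {
        // $match[1] is the number captured between the parentheses.
        return item_widget((int) $match[1]);
    },
    $string
);

echo $output, "\n";
```

Only the markers are replaced; the surrounding text is carried over unchanged, which addresses the "removes the rest of the content" problem.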
This works for me:
{item\(([\d]+)\)}
Checkout this eval.
So there are two parts to this question. First of all, you need to write a tag-extracting part, and then a substitution part:
<?php
$in = "foo bar {item(1)} {item(2)}";
$out = $in;
if ($m = preg_match_all("/({item\([0-9]+\)})/is",$in,$matches)){
foreach ($matches[1] as $match){
preg_match("/\(([0-9]+)\)/", $match, $t);
$id = $t[1];
/* now we have id, do the substitution */
$out = preg_replace("/".preg_quote($match) . "/", "foo($id)", $out);
}
}
Now $out should have the replaced string.
I'm a newbie to PHP,
and I need to get two results from the same page: og:image and og:video.
This is my current code:
preg_match('/property="og:video" content="(.*?)"/', file_get_contents($url), $matchesVideo);
preg_match('/property="og:image" content="(.*?)"/', file_get_contents($url), $matchesThumb);
$videoID = ($matchesVideo[1]) ? $matchesVideo[1] : false;
$videoThumb = ($matchesThumb[1]) ? $matchesThumb[1] : false;
Is there a way to execute the same operation without duplicating my code
Save the file contents to a variable, and if you want to run a single regular expression, you can opt for:
$file = file_get_contents($url);
preg_match_all('/property="og:(?P<type>video|image)" content="(?P<content>.*?)"/', $file, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    // $match['type'] is "video" or "image"; $match['content'] is the attribute value
    echo $match['type'], ': ', $match['content'], "\n";
}
As @hakre points out, an outer non-capturing parenthesis pair (?:...) is not needed.
Such a pair would cause a match without storing it, which adds nothing here.
The capture groups use named subpatterns (?P<name>...). The type group accepts either of the two words video|image; the content group captures the value of the content attribute.
There is no problem with having those two lines. What I would change though is the double call to file_get_contents($url).
Just change it to:
$html = file_get_contents($url);
preg_match('/property="og:video" content="(.*?)"/', $html, $matchesVideo);
preg_match('/property="og:image" content="(.*?)"/', $html, $matchesThumb);
Is there a way to execute the same operation without duplicating my code
There are always two ways to do that:
Buffer an execution result - instead of executing multiple times.
Encode the repetition - extract parameters from code.
In programming you normally make use of both. For example the buffering of the file I/O operation:
$buffer = file_get_contents($url);
And for the matching, you encode the repetition:
$match = function ($what) use ($buffer) {
$pattern = sprintf('/property="og:%s" content="(.*?)"/', $what);
$result = preg_match($pattern, $buffer, $matches);
return $result ? $matches[1] : NULL;
};
$match('video');
$match('image');
This is only an example to show what I mean. It depends a bit on how much you want to do this; e.g. the latter allows you to replace the matching with a different implementation, such as an HTML parser, but you might find it too much code at the moment for what you need to do and only go with the buffering.
E.g. the following could be applicable as well:
$buffer = file_get_contents($url);
$mask = '/property="og:%s" content="(.*?)"/';
preg_match(sprintf($mask, 'video'), $buffer, $matchesVideo);
preg_match(sprintf($mask, 'image'), $buffer, $matchesThumb);
Hope this helps.
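The HTML-parser variant mentioned above could look roughly like this. It is a sketch that assumes PHP's DOM extension is available; og_properties() is a hypothetical name and the sample markup is invented.

```php
<?php
// Hypothetical helper: returns the content of the requested og:* meta tags,
// or null for each one that is not present.
function og_properties(string $html, array $names): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = array_fill_keys($names, null);
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $property = $meta->getAttribute('property');
        if (strpos($property, 'og:') !== 0) {
            continue;
        }
        $name = substr($property, 3);
        // Keep the first occurrence of each requested property.
        if (array_key_exists($name, $found) && $found[$name] === null) {
            $found[$name] = $meta->getAttribute('content');
        }
    }
    return $found;
}

$html = '<html><head>'
    . '<meta property="og:video" content="https://example.com/v.mp4">'
    . '<meta property="og:image" content="https://example.com/t.jpg">'
    . '</head><body></body></html>';
print_r(og_properties($html, array('video', 'image')));
```

Unlike the regular expression, this does not depend on the order of the property and content attributes inside the tag.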
Hacking up what I thought was the second simplest type of regex (extract a matching string from some strings, and use it) in php, but regex grouping seems to be tripping me up.
Objective
take a ls of files, output the commands to format/copy the files to have the correct naming format.
Resize copies of the files to create thumbnails. (not even dealing with that step yet)
Failure
My code fails at the regex step, because although I just want to filter out everything except a single regex group, when I get the results, it's always returning the group that I want -and- the group before it, even though I in no way requested the first backtrace group.
Here is a fully functioning, runnable version of the code on the online ide:
http://ideone.com/2RiqN
And here is the code (with a cut down initial dataset, although I don't expect that to matter at all):
<?php
// Long list of image names.
$file_data = <<<HEREDOC
07184_A.jpg
Adrian-Chelsea-C08752_A.jpg
Air-Adams-Cap-Toe-Oxford-C09167_A.jpg
Air-Adams-Split-Toe-Oxford-C09161_A.jpg
Air-Adams-Venetian-C09165_A.jpg
Air-Aiden-Casual-Camp-Moc-C09347_A.jpg
C05820_A.jpg
C06588_A.jpg
Air-Aiden-Classic-Bit-C09007_A.jpg
Work-Moc-Toe-Boot-C09095_A.jpg
HEREDOC;
if($file_data){
$files = preg_split("/[\s,]+/", $file_data);
// Split up the files based on the newlines.
}
$rename_candidates = array();
$i = 0;
foreach($files as $file){
$string = $file;
$pattern = '#(\w)(\d+)_A\.jpg$#i';
// Use the second regex group for the results.
$replacement = '$2';
// This should return only group 2 (any number of digits), but instead group 1 is somehow always in there.
$new_file_part = preg_replace($pattern, $replacement, $string);
// Example good end result: <img src="images/ch/ch-07184fs.jpg" width="350" border="0">
// Save the rename results for further processing later.
$rename_candidates[$i]=array('file'=>$file, 'new_file'=>$new_file_part);
// Rename the images into a standard format.
echo "cp ".$file." ./ch/ch-".$new_file_part."fs.jpg;";
// Echo out some commands for later.
echo "<br>";
$i++;
if($i>10){break;} // Just deal with the first 10 for now.
}
?>
Intended result for the regex: 788750
Intended result for the code output (multiple lines of): cp air-something-something-C485850_A.jpg ./ch/ch-485850.jpg;
What's wrong with my regex? Suggestions for simpler matching code would be appreciated as well.
Just a guess:
$pattern = '#^.*?(\w)(\d+)_A\.jpg$#i';
This includes the whole filename in the match. Otherwise preg_replace() will really only substitute the end of each string - it only applies the $replacement expression on the part that was actually matched.
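As for simpler matching code: since only the digits are needed, preg_match() with a single capture group may be easier to reason about than a replacement. The sketch below assumes every relevant name ends in digits followed by _A.jpg; product_number() is a hypothetical helper name. It also keeps the leading digit of a name such as 07184_A.jpg, which (\w) in the patterns above would consume.

```php
<?php
// Hypothetical helper: returns the digits directly before "_A.jpg", or null
// if the file name does not follow that scheme.
function product_number(string $file): ?string
{
    return preg_match('/(\d+)_A\.jpg$/i', $file, $m) ? $m[1] : null;
}

$files = array('07184_A.jpg', 'Adrian-Chelsea-C08752_A.jpg', 'C05820_A.jpg', 'notes.txt');
foreach ($files as $file) {
    $number = product_number($file);
    if ($number === null) {
        continue; // not an image in the expected naming scheme
    }
    echo "cp {$file} ./ch/ch-{$number}fs.jpg;\n";
}
```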
Scan Dir and Explode
You know what? A simpler way to do it in php is to use scandir and explode combo
$dir = scandir('/path/to/directory');
foreach($dir as $file)
{
$ext = pathinfo($file,PATHINFO_EXTENSION);
if($ext!='jpg') continue;
$a = explode('-',$file); //grab the end of the string after the -
$newfilename = end($a); //if there is no dash just take the whole string
$newlocation = './ch/ch-'.str_replace(array('C','_A'),'', basename($newfilename,'.jpg')).'fs.jpg';
echo "#copy($file, $newlocation)\n";
}
#and you are done :)
explode: basically, a filename like blah-2.jpg is turned into an array('blah','2.jpg'), and then taking the end() of that gets the last element. It's almost the same as array_pop(), except that end() doesn't remove the element from the array.
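A tiny illustration of that explode/end step, using file names from the question (the helper name last_dash_segment() is made up for the example):

```php
<?php
// Returns the part of a file name after the last dash, or the whole name
// if it contains no dash.
function last_dash_segment(string $file): string
{
    $parts = explode('-', $file);
    // end() needs a variable, because it moves the array's internal pointer.
    return end($parts);
}

echo last_dash_segment('Air-Adams-Venetian-C09165_A.jpg'), "\n";
echo last_dash_segment('C05820_A.jpg'), "\n";
```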
Working Example
Here's my ideone code: http://ideone.com/gLSxA