The routine below does two scans over an input stream of hypertext. The first pass is a spin replacement on user-defined phrase options. The second pass is a find-and-replace on the tags collection in the doReplace function below.
I'm just looking for suggestions on how it might be optimized. I'm having no performance issues as is, but I want to build for scalability.
/* FIND REPLACE SPIN
--------------------------------------------------------------------*/
function doReplace($content)
{
// content is a precompiled text document formatted with HTML and
// special replacement tags matching the $tags array below
$tags = array('[blog-name]', '[blog-url]', '[blog-email]');
$replacements = array('value1', 'value2', 'value3');
$content = str_replace($tags, $replacements, $content);
return $content;
}
function doSpin($content) {
// the content also has phrase option tags denoted by [%phrase1|phrase2_|phrase3%]
// delimiters throughout the text.
return preg_replace_callback('!\[%(.*?)%\]!', 'pick_one', $content);
}
function pick_one($matches) {
$choices = explode('|', $matches[1]);
return $choices[rand(0, count($choices)-1)];
}
$my_source_page = file_get_contents('path/to/source');
$my_source1_spin = doSpin($my_source_page);
$my_source1_replace = doReplace($my_source1_spin);
$my_source1_final = addslashes($my_source1_replace);
//Now do something with $my_source1_final
To be honest, I don't see anything wrong with the code you've posted. The main bottleneck in the code is likely going to be the file_get_contents call.
The only thing I can see myself is that you're allocating the string to different variables (four variables beginning $my_source) which will use more memory than if you just used 1 or 2 variables.
But unless you're reading a large amount of text into memory very frequently on a busy site, then I don't think you need to worry about the code you've posted. And you said yourself, you're not having any performance issues at the moment ;)
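If memory ever does become a concern, the intermediate copies can be avoided by reusing one variable. A minimal sketch, assuming the same tag syntax as in the question; the sample string and replacement values are made up, and array_rand is swapped in for rand purely for brevity:

```php
<?php
// Pick one option from a [%a|b|c%] group (same idea as pick_one in the question).
function pickOne(array $matches): string
{
    $choices = explode('|', $matches[1]);
    return $choices[array_rand($choices)];
}

// Hypothetical sample content; the real text would come from file_get_contents().
$content = 'Welcome to [blog-name]! [%Hello|Hi%] from [blog-url].';

// Reuse a single variable instead of one new variable per step.
$content = preg_replace_callback('!\[%(.*?)%\]!', 'pickOne', $content);
$content = str_replace(
    array('[blog-name]', '[blog-url]'),
    array('My Blog', 'http://example.com'),
    $content
);

echo $content, "\n";
```

Whether this saves anything measurable depends on the size of the documents; for small pages the difference is unlikely to matter.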
Related
I'm currently trying out the PHP preg_replace function and I've run into a small problem. I want to replace all the <br /> tags with a div with an ID, unique for every div, so I thought I would put it in a for loop. But strangely, it only does the first line and gives it an ID of 49, which is the last ID they can get. Here's my code:
$res = mysqli_query($mysqli, "SELECT * FROM song WHERE id = 1");
$row = mysqli_fetch_assoc($res);
mysqli_set_charset("utf8");
$lyric = $row['lyric'];
$lyricHTML = nl2br($lyric);
$lines_arr = preg_split('[<br />]',$lyricHTML);
$lines = count($lines_arr);
for($i = 0; $i < $lines; $i++) {
$string = preg_replace(']<br />]', '</h4><h4 id="no'.$i.'">', $lyricHTML, 1);
echo $i;
}
echo '<h4>';
echo $string;
echo '</h4>';
How it works is that I have a large amount of text in my database, and when I put it into the lyric variable, it's just plain text. But when I nl2br it, it gets a <br /> after every line, which I use here. I get the number of <br /> tags by using the little "lines_arr" method as you can see, and then basically iterate in a for loop.
The only problem is that it only replaces the <br /> on the first line and gives that an ID of 49. When I move it outside the for loop and remove the limit, it works and all lines get an <h4> around them, but then I don't get the unique ID I need.
This is some text I pulled out from the database
Mama called about the paper turns out they wrote about me
Now my broken heart´s the only thing that's broke about me
So many people should have seen what we got going on
I only wanna put my heart and my life in songs
Writing about the pain I felt with my daddy gone
About the emptiness I felt when I sat alone
About the happiness I feel when I sing it loud
He should have heard the noise we made with the happy crowd
Did my Gran Daddy know he taught me what a poem was
How you can use a sentence or just a simple pause
What will I say when my kids ask me who my daddy was
I thought about it for a while and I'm at a loss
Knowing that I´m gonna live my whole life without him
I found out a lot of things I never knew about him
All I know is that I´ll never really be alone
Cause we gotta lot of love and a happy home
And my goal is to give every line an <h4 id="no1">TEXT</h4> for example, and the number after no, like no1 or no4 should be incremented every iteration, that's why I chose a for-loop.
Looks like your regexp needs proper delimiters, with the slash inside the pattern escaped:
preg_replace('/<br \/>/', ...);
Really though, this is a classic XY Problem. Instead of asking us how to fix your solution, you should ask us how to solve your problem.
Show us some example text in the database and then show us how you would like it to be formatted. It's very likely there's a better way.
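As for why the loop only ever produces one replacement with the last ID: each iteration runs preg_replace on the untouched $lyricHTML and overwrites $string, so only the final pass survives. A minimal sketch of a loop that feeds each result back in, using a made-up three-line lyric instead of the database row:

```php
<?php
// Hypothetical sample text standing in for $row['lyric'].
$lyric = "a\nb\nc";

$string = nl2br($lyric);                      // "a<br />\nb<br />\nc"
$breaks = substr_count($string, '<br />');    // number of tags to replace

for ($i = 0; $i < $breaks; $i++) {
    // Work on $string (the previous result), not on the original text.
    // The first line gets no1 from the wrapper below, so numbering starts at 2.
    $string = preg_replace(
        '/<br \/>/',
        '</h4><h4 id="no' . ($i + 2) . '">',
        $string,
        1
    );
}

$string = '<h4 id="no1">' . $string . '</h4>';
echo $string, "\n";
```

The newlines that nl2br leaves behind end up at the start of each following <h4>, which is harmless in HTML but is one more reason splitting on the line breaks directly may be tidier.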
I would use array_walk for this.
$lines = preg_split("/[\r\n]+/", $row['lyric']);
array_walk($lines, function(&$line, $idx) {
$line = sprintf('<h4 id="no%d">%s</h4>', $idx+1, $line);
});
echo implode("\n", $lines);
Output
<h4 id="no1">Mama called about the paper turns out they wrote about me</h4>
<h4 id="no2">Now my broken heart's the only thing that's broke about me</h4>
<h4 id="no3">So many people should have seen what we got going on</h4>
...
<h4 id="no16">Cause we gotta lot of love and a happy home</h4>
Explanation of solution
nl2br doesn't really help us here. It converts \n to <br /> but then we'd just end up splitting the string on the br. We might as well split using \n to start with. I'm going to use /[\r\n]+/ because it splits one or more \r, \n, and \r\n.
$lines = preg_split("/[\r\n]+/", $row['lyric']);
Now we have an array of strings, each containing one line of lyrics. But we want to wrap each string in an <h4 id="noX">...</h4> where X is the number of the line.
Ordinarily we would use array_map for this, but the array_map callback does not receive an index argument. Instead we will use array_walk which does receive the index.
One more note about this line is the use of &$line as the callback parameter. This allows us to alter the contents of $line and have it "saved" in our original $lines array. (See Example #1 in the PHP docs to compare the difference.)
array_walk($lines, function(&$line, $idx) {
Here's where the h4 comes in. I use sprintf for formatting HTML strings because I think it is more readable. It also lets you control how the arguments are output without adding a bunch of view logic to the "template".
Here's the world's tiniest template: '<h4 id="no%d">%s</h4>'. It has two inputs, %d and %s. The first will be output as a number (our line number), and the second will be output as a string (our lyrics).
$line = sprintf('<h4 id="no%d">%s</h4>', $idx+1, $line);
Close the array_walk callback function
});
Now $lines is an array of our newly-formatted lyrics. Let's output the lyrics by separating each line with a \n.
echo implode("\n", $lines);
Done!
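As a side note on the array_map remark: array_map can be handed a second array, so passing array_keys($lines) alongside gives the callback an index too. A sketch with made-up sample lines; htmlspecialchars is an addition of mine, assuming the lyrics are plain text that should be escaped:

```php
<?php
// Hypothetical sample lines standing in for the split lyrics.
$lines = array('first line', 'second & last line');

$html = array_map(
    function ($line, $idx) {
        // $idx comes from the second array (the keys), so it starts at 0.
        return sprintf('<h4 id="no%d">%s</h4>', $idx + 1, htmlspecialchars($line));
    },
    $lines,
    array_keys($lines)
);

echo implode("\n", $html), "\n";
```

Unlike array_walk, this leaves $lines untouched and returns a new array.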
If each line of the text in your db ends with a newline, why not just explode it on the \n character?
Where a plain string function does the job, prefer it over the preg functions, which tend to use more memory:
I would do it like this:
$lyric = $row['lyric'];
$lyrics = explode("\n", $lyric);
$lyricsHtml = array();
$i = 0;
foreach ($lyrics as $val) {
    $i++;
    $lyricsHtml[] = '<h4 id="no' . $i . '">' . $val . '</h4>';
}
$lyricsHtml = implode("\n", $lyricsHtml);
Another way, with preg_replace_callback:
$id = 0;
$lyric = preg_replace_callback('~(^)|$~m',
function ($m) use (&$id) {
return (isset($m[1])) ? '<h4 id="no' . ++$id . '">' : '</h4>'; },
$lyric);
Problem
I'd like to expand variables in a string in the same manner that variables in a double-quoted string are expanded.
$string = '<p>It took $replace s</p>';
$replace = 40;
expression_i_look_for;
$string should become '<p>It took 40 s</p>';
I see an obvious solution like this:
$string = str_replace('"', '\"', $string);
eval('$string = "$string";');
But I really don't like it, because eval() is insecure. Is there any other way to do this ?
Context
I'm building a simple templating engine; that's where I need this.
Example Template (view_file.php)
<h1>$title</h1>
<p>$content</p>
Template rendering (simplified code):
$params = array('title' => ...);
function render($view_file, $params)
{
extract($params);
ob_start();
include($view_file);
$text = ob_get_contents();
ob_end_clean();
expression_i_look_for; // this will expand the variables in the template
return $text;
}
The expansion of the variables in the template simplifies its syntax. Without it, the above example template would be:
<h1><?php echo $title;?></h1>
<p><?php echo $content;?></p>
Do you think this approach is good ? Or should I look in another direction ?
Edit
Finally I understand that there is no simple solution, due to the flexible way PHP expands variables (even ${$var}->member[0] would be valid).
So there are only two options:
Adopt an existing full fledged templating system
Stick with something very basic that essentially is limited to including the view files via include.
I would rather suggest using some existing template engines, like for example Smarty, but if you really want to do it by yourself you can use the simple regular expression to match all variables constructed with for example letters and numbers and then replace them with correct variables:
<?php
$text = 'hello $world, what is the $matter? I like $world!';
preg_match_all('/\$([a-zA-Z0-9]+)/',
$text,
$out, PREG_PATTERN_ORDER);
$world = 'World';
$matter = 'matter';
foreach(array_unique($out[1]) as $variable){
$text=str_replace('$'.$variable, $$variable, $text);
}
echo $text;
?>
prints
hello World, what is the matter? I like World!
Parse
Parse the string, looking for $ followed by a valid variable name (i.e. [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)
Variable²
Use variable variables syntax (i.e. $$var notation).
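Those two steps could look roughly like the sketch below. It assumes the values live in an array (as with $params in the question) rather than in real variables, so an array lookup stands in for the $$var notation; expand_vars is a hypothetical name.

```php
<?php
// Replace $name tokens with values from $params; expand_vars is a made-up helper.
function expand_vars(string $text, array $params): string
{
    return preg_replace_callback(
        // "$" followed by a valid PHP variable name
        '/\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/',
        function (array $m) use ($params) {
            // Leave unknown names untouched instead of raising a notice.
            return array_key_exists($m[1], $params) ? (string) $params[$m[1]] : $m[0];
        },
        $text
    );
}

echo expand_vars('<p>It took $replace s and $unknown</p>', array('replace' => 40)), "\n";
```

Note that this only handles plain $name tokens; array access or property access inside the string is not expanded.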
Are you trying to do this?
templater.php:
<?php
$first = "first";
$second = "second";
$third = "third";
include('template.php');
template.php:
<?php
echo 'The '.$first.', '.$second.', and '.$third.' variables in a string!';
When templater.php is run, produces:
"The first, second, and third variables in a string!"
Do you want something like this ?
$replace = 40;
$string = "<p>It took {$replace}s</p>";
Instead of using single quotes
$string = '<p>It took $replace s</p>';
$replace = 40;
use double quotes
$replace = 40;
$string = "<p>It took $replace s</p>";
However, for readability and to enable you to remove the space between $replace and the s I would use:
$replace = 40;
$string = '<p>It took ' . $replace . 's</p>';
The correct way is probably to parse your document as a tree, identify your parser tags ( because you are managing your own parser they don't have to follow php conventions if you don't want them to ) and then add in your values from an associative array or other data structure as the opportunity arises.
This is a more complex solution but will make it far easier when you realise that you want to be able to display lists whose length is unknown ahead of time using some kind of looping structure based on a standard display option. In the long run, you won't find many serious templating systems that aren't parsing the documents into some kind of in-memory tree where the placeholders can be inserted and then the document constructed as required. This also offers many opportunities for caching. Also, if you are unafraid of recursion you will be able to perform a lot of operations on it fairly simply.
However, this is not an uncommon problem to solve and as I commented on the question, there are almost guaranteed to be libraries and extensions around that provide most of the functionality you need. Unless this is a purely academic process for you, I would find some existing solutions and either use one of those or get a solid understanding of how it works so you have a starting point for adapting your own solution.
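To make the idea a little more concrete, here is a deliberately tiny sketch. It only splits a template into a flat list of text and placeholder nodes, with no loops or nesting, and the {{name}} tag syntax and both function names are invented for illustration.

```php
<?php
// Split a template into nodes: array('text', '...') or array('var', 'name').
function parse_template(string $template): array
{
    $parts = preg_split('/\{\{\s*(\w+)\s*\}\}/', $template, -1, PREG_SPLIT_DELIM_CAPTURE);
    $nodes = array();
    foreach ($parts as $i => $part) {
        // With PREG_SPLIT_DELIM_CAPTURE, odd indexes hold the captured names.
        $nodes[] = array($i % 2 === 1 ? 'var' : 'text', $part);
    }
    return $nodes;
}

// Walk the node list and fill placeholders from an associative array.
function render_nodes(array $nodes, array $values): string
{
    $out = '';
    foreach ($nodes as $node) {
        list($type, $value) = $node;
        if ($type === 'var') {
            // Missing values render as an empty string; values are escaped.
            $out .= htmlspecialchars((string) ($values[$value] ?? ''));
        } else {
            $out .= $value;
        }
    }
    return $out;
}

$nodes = parse_template('<h1>{{ title }}</h1><p>{{content}}</p>');
echo render_nodes($nodes, array('title' => 'Hi', 'content' => 'a < b')), "\n";
```

The parsed node list is the part that could be cached, since it does not depend on the values being rendered.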
This is a snippet I pulled out from Lejlot's answer. I tested it and it works fine.
function resolve_vars_in_str( $input )
{
preg_match_all('/\$([a-zA-Z0-9]+)/', $input, $out, PREG_PATTERN_ORDER);
foreach(array_unique($out[1]) as $variable) $input=str_replace('$'.$variable, $GLOBALS["$variable"], $input);
return $input ;
}
Hello, I'm a newbie in PHP. I am trying to make a search function using PHP, but only inside the website, without any database.
Basically, if I search for a string such as "Health", it would display the lines
The Joys of Health
Healthy Diets
This snippet is the only thing I could find that, if properly coded, would output the lines I want:
$myPage = array("directory.php","pages.php");
$lines = file($myPage[n]);
echo $lines[n];
I haven't tried it yet, but before I do I want to ask whether there is a better way to do this.
If my files have too many lines, won't it stress the server?
The file() function will return an array. You should use file_get_contents() instead, as it returns a string.
Then, use regular expressions to find specific text within a link.
Your goal is fine but the method you're thinking about is not. The file() function reads a file line by line and puts the lines into an array. This assumes the HTML is well-structured in a human-readable fashion, which is not always the case. However, if you're the one providing the HTML and you make sure the structure is perfectly defined, OK. Here is the example you provided, but completed (bear in mind it's the 'wrong' way of solving your problem, but if you want to follow that pattern, it's OK):
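function pagesearch($pages, $string) {
    $tags = [];
    if (empty($pages) || empty($string)) {
        return $tags;
    }
    foreach ($pages as $page) {
        // file() returns false on failure, so check before looping
        $lines = file($page);
        if ($lines === false) {
            continue;
        }
        foreach ($lines as $line) {
            // mb_strpos returns 0 for a match at the start of the line,
            // so compare against false explicitly
            if (mb_strpos($line, $string) !== false) {
                $tags[$page][] = $line;
            }
        }
    }
    return $tags;
}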
This will return you an array with all the pages you referenced with all occurrences of the word you look for, separated by page. As I said, it's not the way you want to solve this, but it's a way.
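If memory is the worry, a variation on the same idea reads each file one line at a time with fgets instead of loading it whole with file(). This is only a sketch: the function name is made up, stripos makes the match case-insensitive, and strip_tags reflects my assumption that you want the text without the markup.

```php
<?php
// search_pages is a hypothetical name; it returns matching lines grouped by file.
function search_pages(array $pages, string $needle): array
{
    $hits = array();
    foreach ($pages as $page) {
        $handle = @fopen($page, 'r');
        if ($handle === false) {
            continue; // skip files that cannot be opened
        }
        // Only one line is held in memory at a time.
        while (($line = fgets($handle)) !== false) {
            // Note: this checks the raw line, so text inside tag attributes matches too.
            if (stripos($line, $needle) !== false) {
                $hits[$page][] = trim(strip_tags($line));
            }
        }
        fclose($handle);
    }
    return $hits;
}

// Usage with a temporary file standing in for one of the site's pages.
$tmp = tempnam(sys_get_temp_dir(), 'pg');
file_put_contents($tmp, "<a href=\"#\">The Joys of Health</a>\n<a href=\"#\">Healthy Diets</a>\n<a href=\"#\">Contact</a>\n");
$result = search_pages(array($tmp), 'health');
print_r($result[$tmp]);
unlink($tmp);
```

This keeps memory flat, but every search still reads every file, so the cost per search grows with the size of the site.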
Hope that helps
Because you do not want to use any database and because the term database is very broad and includes the file-system you want to do a search in some database without having a database.
That makes no sense. In your case one database at least is the file-system. If you can accept the fact that you want to search a database (here your html files) but you do not want to use a database to store anything related to the search (e.g. some index or cached results), then what you suggest is basically how it is working: A real-time, text-based, line-by-line file-search.
Sure it is very rudimentary but as your constraint is "no database", you have already found the only possible way. And yes it will stress your server when used because real-time search is expensive.
Otherwise normally Lucene/Solr is used for the job but that is a database and a server even.
Say I have a string "Some text [login_form] some other text". How can I replace '[login_form]' with the PHP code "require('somescript.php');" and run the 'require' function.
I don't want to use 'eval' as my string contain HTML and other code and also has great possibility of errors.
You can do this:
search with regex for [\[(.*)\]]
$replace = include_once($matched_string);
replace [\[(.*)\]] with $replace
Hmm?
Maybe this could also work:
$string = preg_replace_callback(
'/\[(.*?)\]/',
function($matches) {
ob_start();
include $matches[1].'.php';
return ob_get_clean();
},
$string
);
EDIT: I saw this approach within a few "home made" CMS but instead of includeing files they all called a class or function. It could be extended with parameters like Hey, check out this new gallery: [gallery, 15, 200, 200].
You parse that string and find out that You have to call object $gallery, probably a method view with the parameters 15, 200, 200 that will be how many images to show per page and the image thumbnail resolution... So You will call $gallery->view(15, 200, 200);.
In this case the above PHP code will be extended to:
$string = preg_replace_callback(
    '/\[(.*?)\]/',
    function($matches) {
        $params = explode(', ', $matches[1]); // object name followed by the parameters
        $object = array_shift($params);
        // the closure has its own scope, so look the object up among the globals
        return $GLOBALS[$object]->view($params); // for simplicity we pass parameters as an array
    },
    $string
);
Is this what You want to achieve?
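A variation that avoids including arbitrary files or reaching for arbitrary objects is to keep an explicit list of allowed tags. The sketch below is only an illustration: the handler names and their output are invented, and unknown tags are left as they are.

```php
<?php
// Hypothetical handlers; each receives the tag's parameters as an array.
$handlers = array(
    'login_form' => function (array $args) {
        return '<form class="login"></form>';
    },
    'gallery' => function (array $args) {
        return sprintf(
            '<div class="gallery" data-args="%s"></div>',
            htmlspecialchars(implode(',', $args))
        );
    },
);

// Replace [name, arg, ...] tags using only the handlers that were registered.
function expand_tags(string $text, array $handlers): string
{
    return preg_replace_callback(
        '/\[([^\[\]]+)\]/',
        function (array $m) use ($handlers) {
            $args = array_map('trim', explode(',', $m[1]));
            $name = array_shift($args);
            // Unknown tags are returned unchanged.
            return isset($handlers[$name]) ? $handlers[$name]($args) : $m[0];
        },
        $text
    );
}

echo expand_tags('Some text [login_form] and [gallery, 15, 200, 200] and [unknown]', $handlers), "\n";
```

Because only names present in $handlers are ever executed, text that merely happens to contain square brackets cannot trigger an include or a method call.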
I think eval was your only option here. If you do not want to use it for some reason (and there are probably good reasons), you are stuck. Maybe you can give some more context so we can give you other options?
As long as somescript.php is properly formatted and has a closing tag if an open php tag exists, eval will work, although I am not all for eval.
eval('?>'.file_get_contents('somescript.php').'<?php ');
Can you clarify a little on what you mean by replace [login_form] with a code example you have?
On my site I use output buffering to grab all the output and then run it through a process function before sending it out to the browser (I don't replace anything, just break it into more manageable pieces). In this particular case, there is a massive amount of output because it is listing out a label for every country in the database (around 240 countries). The problem is that with the full output, my preg_match call seems to get skipped over: it does absolutely nothing and returns no matches. However, if I remove parts of the labels (no particular part, just random pieces to reduce the character count) then the preg_match call works again. It doesn't seem to matter what I remove from the labels, as long as I remove enough characters. Is there some sort of cap on what the preg functions can handle, or will they time out if there is too much data to be scanned?
Edit: Here is the function that it is run through.
public function boom($data) {
$number = preg_match_all("/(<!-- ([\w]+):start -->)\n?(.*?)\n?(<!-- \\2:stop -->)/s", $data, $matches, PREG_SET_ORDER);
if ($number == 0) $data = array("content" => $data);
else unset($data);
foreach ($matches as $item):
//$item[3] = preg_replace("/\n/", "", $item[3], 1);
if (preg_match("/<!-- breaker -->/s", $item[3])) $data[$item[2]] = explode("<!-- breaker -->", $item[3]);
else $data[$item[2]] = $item[3];
endforeach;
//die(var_dump($data));
return $data;
}
And here is the unprocessed output that is sent to the page. I have determined that it is the preg_match_all at the beginning that is returning 0 matches in the variable, so the function is simply throwing the entire string it received into $data['content'] and skipping everything else.
http://pastebin.com/iGfM6gxx
I've tried putting the labels on new lines, collapsing them together, nothing seems to work. But as explained above, if I remove random pieces of it, then it goes through fine. The function works perfectly fine with every other page of normal length.
It's hard to say without seeing your regex and data, but try changing pcre.backtrack_limit / pcre.recursion_limit (http://www.php.net/manual/en/pcre.configuration.php)
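One way to confirm that a limit is being hit: preg_match_all returns false (not 0) on failure, and preg_last_error() reports why. Since false == 0 is true in PHP, a check like $number == 0 cannot tell the two cases apart. A sketch, with describe_preg_error and match_all_or_fail as made-up helper names; the commented-out ini_set value is just an example, not a recommendation:

```php
<?php
// Map the common preg_last_error() codes to readable text.
function describe_preg_error(int $code): string
{
    $names = array(
        PREG_NO_ERROR => 'no error',
        PREG_INTERNAL_ERROR => 'internal error',
        PREG_BACKTRACK_LIMIT_ERROR => 'backtrack limit exhausted',
        PREG_RECURSION_LIMIT_ERROR => 'recursion limit exhausted',
        PREG_BAD_UTF8_ERROR => 'malformed UTF-8',
    );
    return $names[$code] ?? 'unknown error ' . $code;
}

// Like preg_match_all, but a failure is reported instead of looking like "no matches".
function match_all_or_fail(string $pattern, string $subject): array
{
    $count = preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
    if ($count === false) {
        throw new RuntimeException(
            'preg_match_all failed: ' . describe_preg_error(preg_last_error())
        );
    }
    return $matches;
}

// ini_set('pcre.backtrack_limit', '5000000'); // example value only

$matches = match_all_or_fail(
    '/<!-- (\w+):start -->(.*?)<!-- \1:stop -->/s',
    '<!-- a:start -->x<!-- a:stop -->'
);
echo count($matches), "\n";
```

If the exception reports the backtrack limit, raising pcre.backtrack_limit or rewriting the pattern so it backtracks less are the two usual directions.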