Remove all non php data from string - php

I want to be able to remove all non php data from a string / file.
Now this preg_replace line works perfectly:
preg_replace('/\?>.*\<?/', '', $src); // Remove all non php data
But the problem is that it only works for the first match, not for the whole string/file.
Small tweak needed here ;)

It would be simpler the other way round:
preg_match_all('~<\?.+?\?>~s', $src, $m);
$php = implode('', $m[0]);
Matching non-php blocks is much trickier, because they can also occur before the first php block and after the last one: blah <? php ?> blah.
Also note that no regex solution can handle a ?> inside a PHP string, as in:
<? echo "hi ?>"; ?>
You have to use the tokenizer (token_get_all) to parse this correctly.
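A minimal sketch of the tokenizer approach, assuming the tokenizer extension is available (it is enabled by default). The helper name keep_php_only is made up for illustration, and the sample uses <?php rather than <? so it does not depend on the short_open_tag setting:

```php
<?php
// Hypothetical helper: keep only the PHP blocks of $src and drop everything
// outside the PHP tags.
function keep_php_only(string $src): string
{
    $out = '';
    foreach (token_get_all($src) as $token) {
        if (is_array($token)) {
            // Array tokens are [id, text, line]; T_INLINE_HTML is the text
            // that sits outside the PHP tags.
            if ($token[0] !== T_INLINE_HTML) {
                $out .= $token[1];
            }
        } else {
            // Single-character tokens such as ';' come back as plain strings.
            $out .= $token;
        }
    }
    return $out;
}

// The ?> inside the string literal does not end the PHP block here.
$php = keep_php_only('blah <?php echo "hi ?>"; ?> blah');
```

Because the tokenizer knows where string literals start and end, the ?> inside "hi ?>" is treated as part of the string and not as a closing tag.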

Related

How to not close string in PHP?

I have a bit of code that is in hundreds of my pages and I need to remove it. The problem is that it contains <?php ?> tags, " and '.
What I was thinking of doing was to put the bit of code in a string and use str_replace() once I have opened the file, but the ' and " close the string, which makes that impossible.
For example, it's something like this:
<?php $x = "test"; echo '1234;' ?><?php $y = 'testing' ?>
Is there a way to stop it from closing the string? Or do you suggest any other solution?
PHP is not recursively embeddable or executable. Just because your files contain PHP code doesn't mean that PHP code is magically special. Inside the file, it's just text, like any other text. You can search/replace as you want:
$code = file_get_contents('somefile.php');
$fixed = str_replace('<?php blah blah blah ?>', '', $code);
file_put_contents('somefile.php', $fixed);
And note that that is literal PHP code inside the str_replace call - like I said, PHP is not recursively embeddable/executable. That's not really PHP code. It's a plain PHP string that happens to contain characters that end up LOOKING like php code.
e.g.
<?php
echo '<?php echo "foo"; ?>';
doesn't output just "foo". You get the literal characters ', <, ?, p, etc... as the output. That internal echo foo business is not PHP code in this context. It's a PHP string that contains characters that would be PHP code if it wasn't inside the ' quotes.
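As for the quotes closing the string: one option is to hold the snippet in a nowdoc, where neither ' nor " needs escaping. A small sketch, with the file contents replaced by an inline string so it can be run on its own (the variable names are only illustrative):

```php
<?php
// The snippet to remove, held in a nowdoc so that ' and " need no escaping.
$needle = <<<'SNIPPET'
<?php $x = "test"; echo '1234;' ?>
SNIPPET;

// Stand-in for file_get_contents('somefile.php').
$code = "<p>before</p>\n" . $needle . "\n<p>after</p>\n";

// Plain text replacement; nothing inside $needle is executed.
$fixed = str_replace($needle, '', $code);
```

The nowdoc body is taken literally, so the snippet can be pasted in unchanged.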
If you want to catch all the PHP tags in a file, you could loop through them, then run a preg_replace to pattern match the tags and remove them.
A quick example for regex could be http://regexr.com/3cu2t

PHP regex replace between wordpress shortcode tag

I have a shortcode which I want to be able to strip away depending on the context of the post. Eg.
[tooltip slug="test"]Test Text[/tooltip]
I would like the output to be:
<span class="dummy">Test Text</span>
I have experimented (a lot!) with preg_replace and I can't seem to get it to recognize that the replacement string is between the ']' and then delimited by '[/tooltip]' without doing multiple passes.
Ideas?
Update: As so often happens, about 10 seconds after I wrote this one of my attempts seemed to work. I don't think it's as good as the solution below but FWIW...
$my_var .= preg_replace('/(?:\[tooltip slug=\"([^\"]*)"[^\>]*\]([^\<]*)\[\/tooltip\])/', '<span class="dummy">\\2</span>', $my_post->post_content);
Here is the simple regex you are looking for.
$result = preg_replace('%\[tooltip slug="[^"]*"]([^[]*)\[/tooltip]%',
'<span class="dummy">\1</span>', $subject);
What we do here is capture the text between the tooltip tags, and insert it in the replacement.
Let me know if you need any details.
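For what it's worth, here is the same pattern run against a string with two shortcodes, to show that a single preg_replace call handles all occurrences. The sample text is made up:

```php
<?php
// Made-up sample content containing two tooltip shortcodes.
$subject = 'Intro [tooltip slug="test"]Test Text[/tooltip] and '
         . '[tooltip slug="b"]More[/tooltip].';

// Group 1 captures the text between the opening and closing tag.
$result = preg_replace(
    '%\[tooltip slug="[^"]*"]([^[]*)\[/tooltip]%',
    '<span class="dummy">$1</span>',
    $subject
);
```

preg_replace replaces every match by default, so no second pass is needed.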
$test = preg_match('/\[([^\]]+)\]([^\[]+)\[/', '[tooltip slug="test"]Test Text[/tooltip]', $matches);
echo $matches[2];

PHP multiline preg_replace to extract portion of a HTML document

I am trying to parse a HTTP document to extract portions of the document, but am unable to get the desired results. Here is what I have got:
<?php
// a sample of HTTP document that I am trying to parse
$http_response = <<<'EOT'
<dl><dt>Server Version: Apache</dt>
<dt>Server Built: Apr 4 2010 17:19:54
</dt></dl><hr /><dl>
<dt>Current Time: Wednesday, 10-Oct-2012 06:14:05 MST</dt>
</dl>
I do not need anything below this, including this line itself
......
EOT;
echo $http_response;
echo '********************';
$count = -1;
$a = preg_replace("/(Server Version)([\s\S]*?)(MST)/", "$1$2$3", $http_response, -1, $count);
echo "<br> count: $count" . '<br>';
echo $a;
I still see the string "I do not need ..." in the output. I do not need that string. What am I doing wrong?
How do I easily remove all other HTML tags as well?
Thanks for your help.
-Amit
You are matching everything from Server Version until MST. And only the part that is matched will later be modified by preg_replace. Everything not covered by the regex remains untouched.
So to replace the string part before your first anchor, and the text following, you also must match them first.
= preg_replace("/^.*(Server Version)(.*?)(MST).*$/s", "$1$2$3",
See the ^.* and .*$. Both will be matched, but aren't mentioned in the replacement pattern; so they get dropped.
Of course, it might be simpler to just use preg_match() in such cases.
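A rough sketch of the preg_match() variant, using a shortened copy of the sample from the question. It also runs strip_tags() on the result, on the assumption that the tags inside the extracted part are not wanted either:

```php
<?php
// Shortened copy of the sample document from the question.
$http_response = "<dl><dt>Server Version: Apache</dt>\n"
    . "<dt>Current Time: Wednesday, 10-Oct-2012 06:14:05 MST</dt>\n"
    . "</dl>\n"
    . "I do not need anything below this\n";

$wanted = '';
// The s modifier lets . match newlines, so the match can span several lines.
if (preg_match('/Server Version.*?MST/s', $http_response, $m)) {
    // $m[0] is the whole match; strip_tags() removes the HTML tags in it.
    $wanted = strip_tags($m[0]);
}
```

Only the matched part ends up in $wanted, so there is nothing to drop afterwards.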
You need to match the other characters before and after your match as well, like:
/.*?(Server Version)([\s\S]*?)(MST).*/s
The 's' flag makes '.' match newlines too; you'll need it because the text spans several lines.
To remove html tags, use strip_tags.

How to make this linking making script behave with markdown?

I'm using PHP Markdown, but I also need a script to convert plaintext links into clickable ones. Both work independently, but when I run them together it breaks: if I run Markdown first, makeLinks still processes the generated HTML and screws things up, and vice versa. Any idea how to stop it from doing that? I can't figure out a regex that ignores the Markdown-style links.
function makeLinks($text) {
    $text = preg_replace('%(((f|ht){1}tp://)[-a-zA-Z0-9#:\%_\+.~#?&//=]+)%i',
        '<a href="\\1">\\1</a>', $text);
    $text = preg_replace('%([[:space:]()[{}])(www.[-a-zA-Z0-9#:\%_\+.~#?&//=]+)%i',
        '\\1<a href="http://\\2">\\2</a>', $text);
    return $text;
}
sample text:
###[Title Section](http://domain/folder/page.html)
- Blah blah some text and then a link: www.webpage.org.
The double-linkify problem can be solved best with guesswork and workarounds. (We have some duplicate questions, but I can never find a good one..)
Since already converted http:// URLs only occur right after href=" or a >, you can use those for a negative lookbehind assertion:
(?<!href="|>)
It should be written at the start of your first regex:
$text = preg_replace('%(?<!href="|>)(((f|ht){1}tp://)...
Your second regex uses the :space: as anchor, so should be fault tolerant already.
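A small sketch of how the assertion behaves, assuming Markdown has already run and produced an <a> tag. The function name linkify_bare_urls and the sample HTML are made up, and the character class is a simplified version of the one in the question:

```php
<?php
// Hypothetical helper: linkify only URLs that are not already inside
// an href="..." attribute or directly after a ">".
function linkify_bare_urls(string $html): string
{
    return preg_replace(
        '%(?<!href="|>)((?:f|ht)tp://[-a-zA-Z0-9#:\%_+.~?&/=]+)%i',
        '<a href="$1">$1</a>',
        $html
    );
}

// The first URL was already converted by Markdown, the second is bare.
$html   = '<a href="http://domain/folder/page.html">Title</a> and http://example.org/x';
$linked = linkify_bare_urls($html);
```

The lookbehind only inspects the characters immediately before the URL, so this stays a workaround rather than real HTML parsing.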

PHP - Removing <?php ?> tags from a string

What's the best way to remove these tags from a string, to prepare it for being passed to eval() ?
for eg. the string can be something like this:
<?php
echo 'hello world';
?>
Hello Again
<?php
echo 'Bye';
?>
Obviously str_replace won't work, because the two PHP tags in the middle need to stay (only the first and the last need to be removed).
Usually, you wouldn't want to pass a function to eval.
If you just want to remove the tags, str_replace would do the job just fine; however, you might be better off using a regex.
preg_replace(array('/<(\?|\%)\=?(php)?/', '/(\%|\?)>/'), array('',''), $str);
This covers old-asp tags, php short-tags, php echo tags, and normal php tags.
Sounds like a bad idea, but if you want the start and end ones removed you could do
$removedPhpWrap = preg_replace('/^<\?php(.*?)(\?>)?$/s', '$1', $phpCode);
This should do it (not tested).
Please tell me also why you want to do it, I'm 95% sure there is a better way.
You could do:
eval("?> $string_to_be_evald <?");
which would close the implicit tags and make everything work.
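A hedged sketch of that idea with the sample string from the question. It prepends only '?>' and captures the output with output buffering so the result can be inspected. As with any eval(), this assumes the string comes from a trusted source:

```php
<?php
// The sample from the question: PHP blocks with plain text in between.
$code = "<?php\necho 'hello world';\n?>\nHello Again\n<?php\necho 'Bye';\n?>\n";

ob_start();
// eval() starts in PHP mode; the leading '?>' switches it to HTML mode,
// so the string's own <?php ... ?> tags work as they would in a file.
eval('?>' . $code);
$output = ob_get_clean();
```

Note that a single newline directly after each ?> is swallowed by the closing tag, the same way it is in a regular PHP file.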
There's no need to use regex; PHP provides functions for removing characters from the beginning or end of a string. ltrim removes characters from the beginning of the string, rtrim from the end of the string, and trim from both ends.
$code = trim ( $code );
$code = ltrim( $code, '<?php' );
$code = rtrim( $code, '?>' );
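The first trim() removes any leading or trailing whitespace that is present in the string. If we omitted this and there was whitespace outside of the PHP tags in the string, then the ltrim and rtrim calls would fail to remove the PHP tags. Note that the second argument of ltrim and rtrim is a list of characters, not a literal prefix or suffix, so they strip any run of those characters from the respective end.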
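If you would rather remove only the exact tags instead of a run of characters, a prefix/suffix check is one alternative. A minimal sketch; the function name strip_outer_php_tags is made up:

```php
<?php
// Hypothetical helper: remove one leading "<?php" and one trailing "?>",
// matched as exact strings rather than as character lists.
function strip_outer_php_tags(string $code): string
{
    $code = trim($code);
    // Drop the opening tag only if the string really starts with it.
    if (strncmp($code, '<?php', 5) === 0) {
        $code = substr($code, 5);
    }
    // Drop the closing tag only if the string really ends with it.
    if (substr($code, -2) === '?>') {
        $code = substr($code, 0, -2);
    }
    return trim($code);
}

$inner = strip_outer_php_tags(
    "<?php\necho 'hello world';\n?>\nHello Again\n<?php\necho 'Bye';\n?>\n"
);
```

The tags in the middle of the string are left alone, which is what the question asks for.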
