Hi, I have placeholder text in my content from the CMS, like this:
$content = "blah blah blah.. yadda yadda, listen to this:
{mediafile file=audiofile7.mp3}
and whilst your here , check this: {mediafile file=audiofile24.mp3}"
and I need to replace the placeholders with some HTML to display the SWF object that plays the mp3.
How do I do a replace that gets the filename from my placeholder?
I think the regex pattern is {mediafile file=[A-Za-z0-9_]} but then how do I apply that to the whole variable containing the markers?
Thanks very much to anyone that can help,
Will
Here is a quick example, using preg_replace_callback, to show how it works:
If $content is declared this way:
$content = "blah blah blah.. {mediafile file=img.jpg}yadda yadda, listen to this:
{mediafile file=audiofile7.mp3}
and whilst your here , check this: {mediafile file=audiofile24.mp3}";
You can replace the placeholders with something like this :
$new_content = preg_replace_callback('/\{mediafile(.*?)\}/', 'my_callback', $content);
var_dump($new_content);
And the callback function might look like this :
function my_callback($matches) {
    $file_full = trim($matches[1]);
    var_dump($file_full); // string 'file=audiofile7.mp3' (length=19)
                          // or string 'file=audiofile24.mp3' (length=20)
    $file = str_replace('file=', '', $file_full);
    var_dump($file); // audiofile7.mp3 or audiofile24.mp3
    if (substr($file, -4) == '.mp3') {
        return '<SWF TAG FOR #' . htmlspecialchars($file) . '#>';
    } else if (substr($file, -4) == '.jpg') {
        return '<img src="' . htmlspecialchars($file) . '" />';
    }
    // Unknown extension: keep the placeholder instead of silently deleting it
    return $matches[0];
}
Here, the last var_dump will get you :
string 'blah blah blah.. <img src="img.jpg" />yadda yadda, listen to this:
<SWF TAG FOR #audiofile7.mp3#>
and whilst your here , check this: <SWF TAG FOR #audiofile24.mp3#>' (length=164)
Hope this helps :-)
Don't forget to add checks and all that, of course ! And your callback function will most certainly become a bit more complicated ^^ but this should give you an idea of what is possible.
BTW: you might want to use create_function to create an anonymous function... But I don't like that: you've got to escape stuff, there is no syntax highlighting in the IDE, ... It's hell with a big/complex function. (Note that create_function is deprecated as of PHP 7.2 and was removed in PHP 8.0.)
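For what it's worth, on PHP 5.3 or later a real anonymous function (closure) can be passed straight to preg_replace_callback, which avoids both the separately named callback and create_function. A minimal sketch, assuming the same placeholder syntax as above; the sample string and the "SWF TAG" output are just illustrative stand-ins:

```php
<?php
// Sample input (made up for this sketch).
$content = "listen to this: {mediafile file=audiofile7.mp3} and this: {mediafile file=img.jpg}";

$new_content = preg_replace_callback(
    // Capture only the file name: everything after "file=" up to a space or "}"
    '/\{mediafile\s+file=([^}\s]+)\}/',
    function ($matches) {
        $file = htmlspecialchars($matches[1]);
        if (substr($matches[1], -4) === '.mp3') {
            return '<SWF TAG FOR #' . $file . '#>';
        }
        if (substr($matches[1], -4) === '.jpg') {
            return '<img src="' . $file . '" />';
        }
        // Unknown extension: leave the placeholder untouched.
        return $matches[0];
    },
    $content
);

echo $new_content, "\n";
// listen to this: <SWF TAG FOR #audiofile7.mp3#> and this: <img src="img.jpg" />
```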
I was originally thinking you could use a function that involved json_decode, but strings need to be wrapped in quotes or json_decode doesn't handle them. So if your placeholders were written as valid JSON, for example:
{"mediafile": "blahblah.mp3"}
you could change my sample code from using explode("=", $song) to json_decode($song, true) and have a nice keyed array to work with.
Either way, I went with using the strtok function to find the placeholders, and then a basic string replace function to change the found placeholders into HTML (the HTML in my sample is just gibberish).
strtok, so far as the PHP docs indicate, does not use regex, so this would not only be simpler but would also avoid a call to the preg library.
One last thing. If you do go with JSON syntax, you will have to re-wrap the placeholders in {} as strtok removes the tokens it is searching by.
<?php
$content = "blah blah blah.. yadda yadda, listen to this:
{mediafile file=audiofile7.mp3}
and whilst your here , check this: {mediafile file=audiofile24.mp3}";

// Turn one placeholder body (e.g. "mediafile file=audiofile7.mp3") into HTML
function song2html($song) {
    $song_info = explode("=", $song);
    $song_url = $song_info[1];
    $song_html = "<object src=\"$song_url\" blahblahblah>blah</object>";
    return $song_html;
}

// Collect every token between braces that looks like a placeholder
$songs = array();
$tok = strtok($content, "{}");
while ($tok !== false) {
    if (strpos($tok, "mediafile") !== false) {
        $songs[] = $tok;
    }
    $tok = strtok("{}");
}

foreach ($songs as $asong) {
    // strtok strips the braces, so put them back to replace the whole placeholder
    $content = str_replace('{' . $asong . '}', song2html($asong), $content);
}
echo $content;
?>
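The JSON variant mentioned above could be sketched roughly like this. The flat {"mediafile": "..."} placeholder shape, the function name song2html_json and the output markup are all assumptions made for illustration, not something from the question:

```php
<?php
// Hypothetical JSON-style placeholders (assumed shape).
$content = 'listen to this: {"mediafile": "audiofile7.mp3"} and this: {"mediafile": "audiofile24.mp3"}';

// Decode one re-wrapped placeholder and build some HTML from it.
function song2html_json($json) {
    $info = json_decode($json, true);
    if (!is_array($info) || !isset($info['mediafile'])) {
        return $json; // not a placeholder we recognise, leave it alone
    }
    $url = htmlspecialchars($info['mediafile']);
    return "<object data=\"$url\"></object>";
}

$placeholders = array();
$tok = strtok($content, "{}");
while ($tok !== false) {
    if (strpos($tok, '"mediafile"') !== false) {
        // strtok drops the braces, so re-wrap the token to get valid JSON back
        $placeholders[] = '{' . $tok . '}';
    }
    $tok = strtok("{}");
}

foreach ($placeholders as $placeholder) {
    $content = str_replace($placeholder, song2html_json($placeholder), $content);
}
echo $content, "\n";
```

Note that this only works while the placeholders stay flat: nested braces would be split apart by strtok.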
Read the regex docs carefully.
Your pattern looks a little off. {mediafile file=([^}]+)} might be more like what you're looking for (the regex you gave doesn't allow for ".").
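As a quick check of that pattern, here is a small sketch with preg_match_all; the sample string is made up, and the braces are escaped to be on the safe side:

```php
<?php
$content = "listen: {mediafile file=audiofile7.mp3} and {mediafile file=audiofile24.mp3}";

// [^}]+ grabs everything up to the closing brace, dots included
preg_match_all('/\{mediafile file=([^}]+)\}/', $content, $matches);

// $matches[1] holds the captured file names
print_r($matches[1]);
```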
You can do something like this:
$content = preg_replace_callback(
    '|{mediafile file=([A-Za-z0-9_.]+)}|',
    create_function(
        // single quotes are essential here,
        // or alternatively escape all $ as \$
        '$matches',
        'return "<embed etc ... " . ($matches[1]) ."more tags";'
    ),
    $content
);
See the manual for preg_replace_callback. Plain preg_replace also works but might be a bit messy.
Related
I'm trying to get rid of php code in a file using regex. Some of the php is not well-formatted, so that there may be extra spaces and/or line breaks. As an example:
<?php require_once('some_sort_of_file.php');
?>
I've come up with the following regex which seems to work:
$initial_text = preg_replace('/\s+/', ' ', $initial_text );
$initial_text = preg_replace('/' . preg_quote('<?php') . '.*?' . preg_quote('?>') . '/', '', $initial_text);
but was wondering if there might be a way to just use 1 regex statement, in order to speed things up.
Thanks!
An even better way to do it: use the built-in tokenizer. Regexes have problems with parsing irregular languages like PHP. The tokenizer, on the other hand, parses PHP code just like PHP itself does.
Sample code:
// some dummy code to play with
$myhtml = '<html>
<body>foo bar
<?php echo "hello world"; ?>
baz
</body>
</html>';
// Our own little function to do the heavy lifting
function strip_php($text) {
    // break the code into tokens
    $tokens = token_get_all($text);
    // loop over the tokens
    foreach($tokens as $index => $token) {
        // If the token is not an array (e.g., ';') or if it is not inline HTML, nuke it.
        if(!is_array($token) || token_name($token[0]) !== 'T_INLINE_HTML') {
            unset($tokens[$index]);
        }
        else { // otherwise, echo it or do whatever you want here
            echo $token[1];
        }
    }
}
strip_php($myhtml);
Output:
<html>
<body>foo bar
baz
</body>
</html>
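If you would rather get the cleaned text back as a string than echo it, the same tokenizer idea can be sketched like this. The function name strip_php_to_string and the sample page are made up for illustration; it assumes the tokenizer extension is available, which it is in a default PHP build:

```php
<?php
// Keep only the inline HTML tokens and return them as one string.
function strip_php_to_string($text) {
    $out = '';
    foreach (token_get_all($text) as $token) {
        // Inline HTML always arrives as an array token: [id, text, line]
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $out .= $token[1];
        }
    }
    return $out;
}

$page = "<p>foo</p>\n<?php echo 'hello'; ?>\n<p>bar</p>\n";
echo strip_php_to_string($page);
```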
You can do it in a single regex using the s modifier, which allows the dot to match newline characters too. I added the i modifier as well to make it case-insensitive; I don't know if you care about that:
$initial_text = preg_replace('~<\?php.*?\?>~si', '', $initial_text );
I'm trying to write a code library for my own personal use and I'm trying to come up with a solution to linkify URLs and mail links. I was originally going to go with a regex statement to transform URLs and mail addresses to links but was worried about covering all the bases. So my current thinking is perhaps use some kind of tag system like this:
l:www.google.com becomes <a href="http://www.google.com">http://www.google.com</a> and m:john.doe@domain.com becomes <a href="mailto:john.doe@domain.com">john.doe@domain.com</a>.
What do you think of this solution and can you assist with the expression? (REGEX is not my strong point). Any help would be appreciated.
Maybe some regex like this :
$content = "l:www.google.com some text m:john.doe@domain.com some text";
$pattern = '/([a-z])\:([^\s]+)/'; // One character followed by ':' and everything after the ':' that is not whitespace
if (preg_match_all($pattern, $content, $results))
{
    foreach ($results[0] as $key => $result)
    {
        // $result is the whole matched expression like 'l:www.google.com'
        $letter = $results[1][$key];
        $target = $results[2][$key]; // separate variable so $content is not overwritten
        echo $letter . ' ' . $target . '<br/>';
        // You can put str_replace here
    }
}
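Taking that one step further, the replacement itself can be done in a single pass with preg_replace_callback. This is only a sketch: the function name linkify_tags, the restriction to the l: and m: prefixes, and the generated markup are assumptions based on the question, and it needs PHP 5.3+ for the closure:

```php
<?php
// Turn l:<host> into a web link and m:<address> into a mailto link.
function linkify_tags($text) {
    return preg_replace_callback(
        // \b keeps us from matching the tail of a longer word such as "html:"
        '/\b([lm]):(\S+)/',
        function ($m) {
            $target = htmlspecialchars($m[2]);
            if ($m[1] === 'l') {
                return '<a href="http://' . $target . '">http://' . $target . '</a>';
            }
            return '<a href="mailto:' . $target . '">' . $target . '</a>';
        },
        $text
    );
}

echo linkify_tags('l:www.google.com some text m:john.doe@domain.com some text'), "\n";
```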
This regex is used to replace text links with a clickable anchor tag.
#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i
My problem is, I don't want it to change links that are in things like <iframe src="http://... or <embed src="http://...
I tried checking for a whitespace character before it by adding \s, but that didn't work.
Or - it appears they're first checking that an href=" doesn't already exist (?) - maybe I can check for the other things too?
Any thoughts / explanations on how I would do this are greatly appreciated. Mainly, I just need the regex - I can implement it in CakePHP myself.
The actual code comes from CakePHP's Text->autoLink():
function autoLinkUrls($text, $htmlOptions = array()) {
$options = var_export($htmlOptions, true);
$text = preg_replace_callback('#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i', create_function('$matches',
'$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], $matches[0],' . $options . ');'), $text);
return preg_replace_callback('#(?<!href="|">)(?<!http://|https://|ftp://|nntp://)(www\.[^\n\%\ <]+[^<\n\%\,\.\ <])(?<!\))#i',
create_function('$matches', '$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], "http://" . $matches[0],' . $options . ');'), $text);
}
You can expand the lookbehind at the beginning of those regexes to check for src=" as well as href=", like this:
(?<!href="|src="|">)
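To see the effect outside of CakePHP, here is a small standalone sketch using plain preg_replace with the extended lookbehind. The sample text and the bare anchor markup are assumptions for illustration; the real helper builds its links through HtmlHelper:

```php
<?php
// Same URL pattern as the first regex above, with src=" added to the lookbehind
$pattern = '#(?<!href="|src="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i';

$text = 'See http://example.com/a and <iframe src="http://example.com/b"></iframe>';
$linked = preg_replace($pattern, '<a href="$1">$1</a>', $text);

echo $linked, "\n";
// See <a href="http://example.com/a">http://example.com/a</a> and <iframe src="http://example.com/b"></iframe>
```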
What regular expression should I use to detect whether the text I want to hyperlink has already been hyperlinked?
Example:
I use
$text = preg_replace('/((http)+(s)?:\/\/[^<>\s]+)/i', '<a href="\\0">\\0</a>', $text);
to link regular URL, and
$text = preg_replace('/[@]+([A-Za-z_0-9]+)/', '<a href="http://twitter.com/\\1">@\\1</a>', $text);
to link Twitter handle.
I want to detect whether or not the text I'm going to hyperlink has already been wrapped in <a>.
Maybe not an answer but another possible solution: you could also search to see if the starting <a> element exists
$text = '<a href="http://example.com">here</a>'; // example value
if (strpos($text, "<a") !== false) {
    // <a start tag was found
}
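or just strip all tags regardless and build the link anyway
$text = '<a href="http://example.com">here</a>'; // example value
$url = 'http://example.com'; // assumed link target
echo '<a href="' . $url . '">' . strip_tags($text) . '</a>';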
Simple: replace the regular URLs first, as that won't affect anything starting with an @, because no URL starts with an @. Then replace the Twitter handles.
That way you don't need to detect if it's been hyperlinked already.
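A minimal sketch of that ordering follows. The sample text, the @example handle and the twitter.com link format are placeholders, and the lookbehind on the handle pattern is my own addition (an assumption, not part of the question) so that e-mail addresses and URL paths containing an @ are left alone:

```php
<?php
$text = 'Follow @example or visit http://example.com/page';

// 1. Link plain URLs first
$text = preg_replace('/(https?:\/\/[^<>\s]+)/i', '<a href="\\0">\\0</a>', $text);

// 2. Then link handles; skip an @ that follows a word character or a slash
$text = preg_replace(
    '/(?<![\w\/])@([A-Za-z0-9_]+)/',
    '<a href="https://twitter.com/\\1">@\\1</a>',
    $text
);

echo $text, "\n";
```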
if (strpos($str, '<a ') !== FALSE) echo 'ok';
else echo 'error';
$html = 'Stephen Ou';
$str = 'Stephen Ou';
if (strlen(str_replace($str, '', $html)) !== strlen($html)) {
echo 'I got a feeling';
}
It's a little bit hard to understand.
In header.php I have this code:
<?
$ID = $link;
$url = downloadLink($ID);
?>
I get the ID from the variable $link --> 12345678
and with $url I get the full link from functions.php.
In functions.php I have this snippet:
function downloadlink ($d_id)
{
$res = @get_url ('' . 'http://www.example.com/' . $d_id . '/go.html');
$re = explode ('<iframe', $res);
$re = explode ('src="', $re[1]);
$re = explode ('"', $re[1]);
$url = $re[0];
return $url;
}
Normally it prints the URL out, but I can't understand the code.
It's written in kind of a strange way, but basically what downloadLink() does is this:
Download the HTML from http://www.example.com/<ID>/go.html
Take the HTML, and split it at every point where the string <iframe occurs.
Now take everything that came after the first <iframe in the HTML, and split it at every point where the string src=" appears.
Now take everything after the first src=" and split it at every point where " appears.
Return whatever was before the first ".
So it's a pretty poor way of doing it, but effectively it looks for the first occurrence of this in the HTML code:
<iframe src="<something>"
And returns the <something>.
Edit: a different method, as requested in comment:
There's not really any particular "right" way to do it, but a fairly straightforward way would be to change it to this:
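function downloadlink ($d_id)
{
    $html = @get_url ('' . 'http://www.example.com/' . $d_id . '/go.html');
    // [^>]* allows other attributes between <iframe and src, like the explode() version did
    if (preg_match('/<iframe[^>]*\ssrc="(.+?)"/', $html, $matches)) {
        return $matches[1];
    }
    return null; // no iframe with a src attribute was found
}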
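Another option is to let an HTML parser find the attribute instead of matching on strings. A rough sketch using PHP's DOM extension; the function name first_iframe_src and the sample markup are made up for illustration, and fetching the page (get_url in the original) is left out:

```php
<?php
// Return the src of the first <iframe> that has one, or null if there is none.
function first_iframe_src($html) {
    if ($html === '' || $html === false || $html === null) {
        return null; // nothing to parse
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('iframe') as $iframe) {
        if ($iframe->hasAttribute('src')) {
            return $iframe->getAttribute('src');
        }
    }
    return null;
}

$sample = '<html><body><iframe width="300" src="http://www.example.com/file.zip"></iframe></body></html>';
echo first_iframe_src($sample), "\n";
// http://www.example.com/file.zip
```

This does not care about attribute order or quoting style, which is where the explode() and regex versions are fragile.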