Creating a pattern for PHP preg_match_all

I'm using preg_match_all() and my problem is that I can't create the pattern that I want. Example of source text:
<td align='left'>
<span style='font-size: 13px; font-family: Verdana;'><span>
</td>
<td>
<a style='color: #ffff00' rel='gb_page_fs[]' title='Parodyk kitiems 8 seriją' href='/pasidalink-19577x10/'>
<img src="/templates/filmai_black/images/ico_tool_share.gif" />
</a>
</td>
<td>
<small>LT titrai</small>
</td>
<td>
<a rel='gb_page_center[528, 290]' title='Žiūrėti 8 seriją' href='http://www.filmai.in/watch.php?em=BuwgzpqtssiAGGcjeekz9PTI1NjQ0N2E~'>
<img src="/templates/filmai_black/images/play_icon.png" width="20" onclick='set_watched_cookie_serial("19577x10", "done-tick-full-series")' />
</a>
</td>
I am using the pattern:
<td><small>(.*)</small></td>
<td><a rel='gb_page_center[528, 290]' title='Žiūrėti (.*) seriją' href='(.*)'><img src=
I want to get the content in the (.*) location into an array.
Can someone please correct my pattern and explain it?
I want to learn to use regular expressions.

"Don't use Regex to parse HTML" aside,
here are a few very simple steps for learning regular expressions:
Download and install RegexBuddy
Run RegexBuddy
Start with something easy and then FLY! :)
The expression you are looking for is:
<small>(.*)</small>
It matches all characters found between the small tags and stores them in a capture group.
Think of the result as an array of arrays: index 0 holds the full matches, index 1 holds the text captured by the first group, and so on.
// command:
preg_match_all('%<small>(.*)</small>%i', $subject, $result, PREG_PATTERN_ORDER);
// $result[0] (full matches)
Array
(
[0] => <small>LT titrai</small>
)
// $result[1] (text captured by the first group)
Array
(
[0] => LT titrai
)
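To capture all three values from the question (the subtitle label, the episode number and the link), the pattern also has to allow whitespace between the tags and escape the square brackets, which otherwise start a character class. Here is a minimal sketch, assuming the markup looks exactly like the sample in the question; the names $html, $pattern and $rows are only illustrative:

```php
<?php
// Sample taken from the question, trimmed to the relevant cells.
$html = <<<'HTML'
<td>
<small>LT titrai</small>
</td>
<td>
<a rel='gb_page_center[528, 290]' title='Žiūrėti 8 seriją' href='http://www.filmai.in/watch.php?em=BuwgzpqtssiAGGcjeekz9PTI1NjQ0N2E~'>
<img src="/templates/filmai_black/images/play_icon.png" width="20" />
</a>
</td>
HTML;

// \s* tolerates the line breaks between tags; \[ and \] match literal brackets;
// the s flag lets . cross newlines, the u flag treats pattern and subject as UTF-8.
$pattern = "~<td>\s*<small>(.*?)</small>\s*</td>\s*<td>\s*"
    . "<a rel='gb_page_center\[528, 290\]' title='Žiūrėti (.*?) seriją' href='([^']*)'>~su";

$rows = [];
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    // With PREG_SET_ORDER each $m is one complete match with its groups.
    foreach ($matches as $m) {
        $rows[] = ['subtitle' => $m[1], 'episode' => $m[2], 'url' => $m[3]];
    }
}
print_r($rows);
```

PREG_SET_ORDER is used here so that each table row ends up as one array entry; with PREG_PATTERN_ORDER the same values would be grouped per capture group instead.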

Related

How to convert Wordpress caption tag to html div tag

I converted a website from WordPress, and some of the posts have a caption tag like the following:
[caption id="attachment_666" align="alignleft" width="316"]
<img class="wp-image-92692" src="img" width="316" alt="fitbit-yoga-lady.png" height="210">
text
[/caption]
I would like to catch all of these captions and convert them to the following:
<div id="attachment_666" style="width: 326px" class="wp-caption alignleft">
<img class="wp-image-92692" src="img" alt="fitbit-yoga-lady.png" width="316" height="210">
<p class="caption">text</p>
</div>
Well, given the exact text that you provided, the following should work.
Search Pattern:
\[caption([^\]]+)align="([^"]+)"\s+width="(\d+)"\](\s*\<img[^>]+>)\s*(.*?)\s*\[\/caption\]
Replacement:
<div\1style="width: \3px" class="wp-caption \2">\4
<p class="caption">\5</p>
</div>
See the demo.
Depending on how tolerant of variations in the input it needs to be, you may need to adjust it from there, but that should at least get you started.
Here's an example of how this could be done with preg_replace:
function convert_caption($content)
{
    return preg_replace(
        '/\[caption([^\]]+)align="([^"]+)"\s+width="(\d+)"\](\s*\<img[^>]+>)\s*(.*?)\s*\[\/caption\]/i',
        '<div\1style="width: \3px" class="wp-caption \2">\4<p class="caption">\5</p></div>',
        $content
    );
}
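Note that the desired output in the question uses width: 326px for a caption whose width is 316, which a plain backreference cannot compute. One possible variation uses preg_replace_callback to do the arithmetic. The 10px offset is an assumption read off the example in the question, and convert_caption_padded is a made-up name:

```php
<?php
function convert_caption_padded(string $content): string
{
    // Same idea as the pattern above; the s flag lets the caption text span lines.
    $pattern = '/\[caption([^\]]*?)\s+align="([^"]+)"\s+width="(\d+)"\]\s*(<img[^>]+>)\s*(.*?)\s*\[\/caption\]/is';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Assumption: the wrapper is 10px wider than the image (316 -> 326).
        $width = (int) $m[3] + 10;

        return '<div' . $m[1] . ' style="width: ' . $width . 'px" class="wp-caption ' . $m[2] . '">' . "\n"
            . $m[4] . "\n"
            . '<p class="caption">' . $m[5] . '</p>' . "\n"
            . '</div>';
    }, $content);

    // preg_replace_callback returns null on error; fall back to the input then.
    return $result ?? $content;
}

$input = <<<'TXT'
[caption id="attachment_666" align="alignleft" width="316"]
<img class="wp-image-92692" src="img" width="316" alt="fitbit-yoga-lady.png" height="210">
text
[/caption]
TXT;

echo convert_caption_padded($input);
```

Like the original pattern, this expects align to come directly before width, so shortcodes with a different attribute order are left untouched.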
I'm doing this blindly on my phone, but I think you can use the following two regular expressions, one for the opening tag and another for the closing:
Find:
\[caption([^\]]*)\]
Replace:
<div$1>
Find:
\[/caption\]
Replace:
</div>

Finding the instance of a specific string and first 2 letters of next string and switch their places in php

I have a specific string that is:
<span style="color: green;">
I want a function that finds each instance of this string and then checks whether the first 2 characters of the next string are </ (the start of a closing tag). If they are, the two should switch places.
I have the idea of searching it character by character, but it would take too long. Is there any shorter solution to it?
Input:
ab <span style="color: green;"> </strong>
Output:
ab </strong> <span style="color: green;">
The strong tag is just for an example, it could be /b, /i, /li or any other closing tag.
You can use preg_replace for this, i.e.:
$myHtml = <<< LOL
ab <span style="color: green;"> </strong>
LOL;
$myHtml = preg_replace('%(<span style="color: green;">)(?:\s+)?(</.*?>)%i', '$2 $1', $myHtml);
echo $myHtml;
//ab </strong> <span style="color: green;">
It will work with any closing tag that follows the span, with or without whitespace in between.
DEMO:
http://sandbox.onlinephpfunctions.com/code/9dc934ece66856a92b041114140982dc822a6bec

How to get content from a div using regex

I have a string like this:
<div class="fck_detail">
<table align="center" border="0" cellpadding="3" cellspacing="0" class="tplCaption" width="1">
<tbody>
<tr><td>
<img alt="nole-1375196668_500x0.jpg" src="http://l.f1.img.vnexpress.net/2013/07/30/nole-1375196668_500x0.jpg" width="500">
</td></tr>
<tr><td class="Image">
Djokovic hậm hực với các đàn anh. Ảnh: <em>Livetennisguide.</em>
</td></tr>
</tbody>
</table>
<p>Riêng với Andy Murray, ...</p>
<p style="text-align:right;"><strong>Anh Hào</strong></p>
</div>
I want to get the content of the div. How do I write this pattern using preg_match? Please help me.
If there are no other HTML tags inside the div, then this regex should work:
$v = '<div class="fck_detail">Some content here</div>';
$regex = '#<div class="fck_detail">([^<]*)</div>#';
preg_match($regex, $v, $matches);
echo $matches[1];
The actual regex here is <div class="fck_detail">([^<]*)</div>. In PHP, a regex must also be wrapped in delimiters; I used # because it doesn't occur in the pattern, so nothing inside needs escaping.
However, if what you're parsing is arbitrary HTML provided by the user, then preg_match simply can't do this. Full-fledged HTML parsing is beyond the ability of any regex, and that's what you'll need if you're parsing the output of a full-fledged HTML editor.
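For markup like the sample in the question, where the div contains other tags, a DOM parser is the usual alternative. Below is a minimal sketch using PHP's bundled DOMDocument and DOMXPath classes. The sample is shortened, the variable names are illustrative, and the XML declaration prefix is a commonly used workaround to make loadHTML treat the input as UTF-8, not something taken from the question:

```php
<?php
// Shortened version of the markup from the question.
$html = <<<'HTML'
<div class="fck_detail">
<table class="tplCaption"><tbody><tr><td class="Image">Caption text</td></tr></tbody></table>
<p>Riêng với Andy Murray, ...</p>
<p style="text-align:right;"><strong>Anh Hào</strong></p>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);
libxml_clear_errors();

// Select the first div whose class attribute is exactly "fck_detail".
$xpath = new DOMXPath($doc);
$div = $xpath->query('//div[@class="fck_detail"]')->item(0);

// Serialize the children to get the inner HTML without the wrapping div.
$inner = '';
if ($div !== null) {
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}
echo $inner;
```

If only the text is needed, $div->textContent returns it with all tags removed.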

Idiomatic HTML generation in PHP

In a separate question here on StackOverflow (PHP library for HTML tag generation) I asked if there is a popular or standard HTML tag library for PHP.
A couple of comments showed up questioning the purpose of such a library.
Here's a bit of code from the highly acclaimed book "PHP and MySQL Web Development 4th Edition" by Luke Welling and Laura Thomson:
echo "<td width = \"".$width."%\">
<a href=\"".$url."\">
<img src=\"s-logo.gif\" alt=\"".$name."\" border=\"0\" /></a>
<span class=\"menu\">".$name."</span>
</td>";
I thought all the escaping and concatenating looked a little messy, so I cooked up an HTML generation library. The above looks like this using the library:
return td(array('width' => $width . '%'),
a(array('href' => $url),
img(array('src' => 's-logo.gif', 'alt' => $name, 'border' => 0))),
a(array('href' => $url), span(array('class' => 'menu'), $name)));
My question is (and keep in mind, I'm a php newb), what's the idiomatic way to write the above? Is there a cleaner way to write the book example?
You can use PHP heredoc syntax
<?php
$width=10;
$url="www.google.com";
$name="stackoverflow";
echo <<<EOT
<td width="{$width}%">
<a href="$url">
<img src="s-logo.gif" alt="$name" border="0" /></a>
<span class="menu">$name</span>
</td>
EOT;
?>
For more information, refer to the PHP manual.
Cleaner - definitely, with heredoc:
echo <<<HTML
<td width="{$width}%">
<a href="{$url}">
<img src="s-logo.gif" alt="{$name}" border="0" /></a>
<span class="menu">{$name}</span>
</td>
HTML;
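Whichever syntax is used, values that come from outside should be escaped before they are placed into the markup. Here is a small sketch that combines heredoc with htmlspecialchars; the function name menu_cell and the sample values are made up for illustration:

```php
<?php
function menu_cell(int $width, string $url, string $name): string
{
    // Escape anything that ends up inside an attribute or element text.
    $url  = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    $name = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<td width="{$width}%">
<a href="{$url}">
<img src="s-logo.gif" alt="{$name}" border="0" /></a>
<span class="menu">{$name}</span>
</td>
HTML;
}

echo menu_cell(10, 'https://www.example.com/?a=1&b=2', 'Tom & "Jerry"');
```

Wrapping the template in a function keeps the escaping in one place, so callers cannot forget it.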

preg_replace can't replace what I need

I need some help with preg_replace. At my website I need to delete some things I don't need.
Here is an example:
[caption id="attachment_100951" align="alignleft" width="448" caption="THIS IS WHAT I NEED"] [/caption]
OK, all I need from this string is the text inside caption="THIS TEXT";
everything else should be deleted. I have searched Google and tried some examples, but nothing worked.
Maybe I need to use another function, but from what I have read on the internet this should replace .
Please help me, it's very important.
Thank you.
EDIT:
The code has some other things that I forgot to mention.
[caption id="attachment_100951" align="alignleft" width="448" caption="eaaaaaaaaaaaaaaaaaa"]
<a href="http://localhost/111baneease1.jpg">
<img class="size-full wp-image-100951" title="zjarr_banese1" src="http://localhost/111baneease1.jpg" alt="eeeeeeeeeeeeeeeee" width="448" height="308" />
</a>
[/caption]
So I need to get the caption text and delete everything in
[caption id="attachment_100951" align="alignleft" width="448" caption="eaaaaaaaaaaaaaaaaaa"]
but not
<a href="http://localhost/111baneease1.jpg">
<img class="size-full wp-image-100951" title="zjarr_banese1" src="http://localhost/111baneease1.jpg" alt="eeeeeeeeeeeeeeeee" width="448" height="308" />
</a>
I also want to delete [/caption].
This regex :
$result = preg_replace('/\[caption\s+.*?caption\s*=(["\'])(.*?)\1.*?\[\/caption\]/', '$2', $subject);
will output :
THIS IS WHAT I NEED
when applied to :
[caption id="attachment_100951" align="alignleft" width="448" caption="THIS IS WHAT I NEED"] [/caption]
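Updated answer based on your updated question (the replacement is double-quoted so that \n becomes a real newline, and \s* on both sides of the third group trims the line breaks next to the tags):
$result = preg_replace('%\[caption\s+.*?caption\s*=(["\'])(.*?)\1\s*\]\s*(.*?)\s*\[/caption\]%s', "\$2\n\$3", $subject);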
The above regex applied to :
[caption id="attachment_100951" align="alignleft" width="448" caption="eaaaaaaaaaaaaaaaaaa"]
<a href="http://localhost/111baneease1.jpg">
<img class="size-full wp-image-100951" title="zjarr_banese1" src="http://localhost/111baneease1.jpg" alt="eeeeeeeeeeeeeeeee" width="448" height="308" />
</a>
[/caption]
Will output :
eaaaaaaaaaaaaaaaaaa
<a href="http://localhost/111baneease1.jpg">
<img class="size-full wp-image-100951" title="zjarr_banese1" src="http://localhost/111baneease1.jpg" alt="eeeeeeeeeeeeeeeee" width="448" height="308" />
</a>
I am not sure if this is exactly what you wanted. Of course you can use the regex to match and do whatever you want with groups $2 and $3...
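As an illustration of working with the groups directly, here is a sketch that collects the caption text and the inner HTML of every shortcode into an array. It uses the pattern above, with \s* on both sides of the third group so that surrounding line breaks are trimmed; the array keys text and html are arbitrary names:

```php
<?php
// Sample from the question, with the img attributes shortened.
$subject = <<<'TXT'
[caption id="attachment_100951" align="alignleft" width="448" caption="eaaaaaaaaaaaaaaaaaa"]
<a href="http://localhost/111baneease1.jpg">
<img class="size-full wp-image-100951" src="http://localhost/111baneease1.jpg" width="448" height="308" />
</a>
[/caption]
TXT;

$pattern = '%\[caption\s+.*?caption\s*=(["\'])(.*?)\1\s*\]\s*(.*?)\s*\[/caption\]%s';

$captions = [];
if (preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // $m[2] is the caption attribute value, $m[3] the markup between the tags.
        $captions[] = ['text' => $m[2], 'html' => $m[3]];
    }
}
print_r($captions);
```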
