Convert HTML output to a different structure with PHP DOM - php

I have a simple HTML file that I need to read some values from and change the structure of for HTML email output. I'm fairly new to scripting/PHP/navigating the DOM, so forgive me if this is a simple question.
Below is the initial output:
<table id="Table_01" width="600" height="547" border="0" cellpadding="0" cellspacing="0">
<tr>
<td colspan="2">
<img src="header.jpg" width="600" height="295" border="0" alt="Alt Text 1"></td>
</tr>
<tr>
<td>
<a href="http://url.com/1">
<img src="leftcell_link1.jpg" width="300" height="163" border="0" alt="Alt Text Left"></a></td>
<td>
<a href="http://url.com/2">
<img src="rightcell_link2.jpg" width="300" height="163" border="0" alt="Alt Text Right"></a></td>
</tr>
<tr>
<td colspan="2">
<a href="http://url.com/3">
<img src="body_link3.jpg" width="600" height="89" border="0" alt="Body Alt"></a></td>
</tr>
</table>
Here is the desired output:
<table id="Table_01" width="100%" border="0" cellpadding="0" cellspacing="0">
<tr>
<td colspan="2" width="100%">
<img src="header.jpg" border="0" alt="Alt Text 1"></td>
</tr>
<tr>
<td width="50%">
<a href="http://url.com/1" name="link1">
<img src="leftcell_link1.jpg" border="0" alt="Alt Text Left" name="link1"></a></td>
<td width="50%">
<a href="http://url.com/2" name="link2">
<img src="rightcell_link2.jpg" border="0" alt="Alt Text Right" name="link2"></a></td>
</tr>
<tr>
<td colspan="2" width="100%">
<a href="http://url.com/3" name="link3">
<img src="body_link3.jpg" border="0" alt="Body Alt" name="link3"></a></td>
</tr>
</table>
Some things to note:
The structure of the input file will not always be the same.
Each "td" width is a percentage: the width attribute of the child (or grandchild) "img" node divided by the total email width (in this case, 600px).
A custom "name" attribute is attached to the "a" and "img" tags, based on a substring of the image "src" attribute.
Would it be best to deconstruct the entire thing into an array of the required element attributes and then reconstruct it in the correct format? Or would it be easier to loop through the DOM, look for the attributes I need, apply them to the parents, and delete unnecessary attributes where needed?
And is there any way to handle this all recursively so that I don't need multiple levels of checks based on whether it's at the "td" "a" or "img" level?

You can make these changes with the DOMDocument class.
<?php
$doc = new DOMDocument();
$doc->loadHTML('<table id="Table_01" width="600" height="547" border="0" cellpadding="0" cellspacing="0"> <tr> <td colspan="2"> <img src="header.jpg" width="600" height="295" border="0" alt="Alt Text 1"></td> </tr> <tr> <td> <img src="leftcell_link1.jpg" width="300" height="163" border="0" alt="Alt Text Left"></td> <td> <img src="rightcell_link2.jpg" width="300" height="163" border="0" alt="Alt Text Right"></td> </tr> <tr> <td colspan="2"> <img src="body_link3.jpg" width="600" height="89" border="0" alt="Body Alt"></td> </tr> </table>');
// getElementsByTagName() returns every td in document order: four cells here.
$tds = $doc->getElementsByTagName('td');
$tds->item(0)->setAttribute('width', '100%'); // header cell
$tds->item(1)->setAttribute('width', '50%');  // left cell
$tds->item(2)->setAttribute('width', '50%');  // right cell
$tds->item(3)->setAttribute('width', '100%'); // body cell
echo $doc->saveHTML();
?>
result:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
...
<td colspan="2" width="100%">
...
<td width="50%">
...
<td width="50%">
...
<td colspan="2" width="100%">
...
</html>
Please read the documentation for this class:
http://php.net/manual/en/class.domdocument.php
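For input whose structure varies, fixed indexes will not hold up. Below is a minimal sketch of a looped version. It assumes that the first table's width attribute is the total email width and that the link name is the part of the image file name after the last underscore; convertForEmail is a hypothetical helper name, not part of DOMDocument.

```php
<?php
// Sketch only: convertForEmail() is a hypothetical helper.
// Assumptions: the first table's pixel width is the email width, and link
// names come from the file-name part after the last underscore.
function convertForEmail(string $html): string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $table = $doc->getElementsByTagName('table')->item(0);
    if ($table === null) {
        return $html; // nothing to convert
    }
    $totalWidth = (int) $table->getAttribute('width');
    $table->setAttribute('width', '100%');
    $table->removeAttribute('height');

    foreach ($doc->getElementsByTagName('img') as $img) {
        $imgWidth = (int) $img->getAttribute('width');
        $img->removeAttribute('width');
        $img->removeAttribute('height');

        // Walk up from the image, so td > img and td > a > img share one loop.
        for ($node = $img->parentNode; $node instanceof DOMElement; $node = $node->parentNode) {
            if ($node->nodeName === 'a'
                && preg_match('/_([^_.]+)\.\w+$/', $img->getAttribute('src'), $m)) {
                $node->setAttribute('name', $m[1]);
                $img->setAttribute('name', $m[1]);
            }
            if ($node->nodeName === 'td') {
                if ($totalWidth > 0) {
                    $node->setAttribute('width', round($imgWidth / $totalWidth * 100) . '%');
                }
                break; // stop at the enclosing cell
            }
        }
    }
    // Serialize only the table, without the doctype and html/body wrappers.
    return $doc->saveHTML($table);
}
```

Walking up from each image answers the recursion question: there is no need for separate checks at the "td", "a", and "img" levels, because the loop simply stops at the first enclosing cell.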

Related

Resize width and height of images

I have content stored in mysql, as following:
<table width="450" cellspacing="1" cellpadding="1" border="1">
<tbody>
<tr>
<td><img width="513" height="680" align="left" alt="" src="/userfiles/image/pic.jpg" /></td>
<td><img width="315" height="700" align="left" alt="" src="/userfiles/image/DSC_0389.JPG" /></td>
<td><img width="580" height="320" align="left" alt="" src="/userfiles/image/ktxh.jpg" /></td>
</tr>
</tbody>
</table>
When I load this from the database and output it as HTML with PHP, there is no problem.
Now I want all images displayed at a fixed width and height of 200 x 200, and the table border set to 0:
<table width="200" cellspacing="1" cellpadding="1" border="0">
<tbody>
<tr>
<td><img width="200" height="200" align="left" alt="" src="/userfiles/image/pic.jpg" /></td>
<td><img width="200" height="200" align="left" alt="" src="/userfiles/image/DSC_0389.JPG" /></td>
<td><img width="200" height="200" align="left" alt="" src="/userfiles/image/ktxh.jpg" /></td>
</tr>
</tbody>
</table>
How do I solve this problem?
Replace it with this code and CSS:
<style>
.cstm
{
width:200px;
height:200px
}
table
{
border:0
}
</style>
<table width="200" cellspacing="1" cellpadding="1" border="0">
<tbody>
<tr>
<td><img class="cstm" align="left" alt="" src="/userfiles/image/pic.jpg" /></td>
<td><img class="cstm" align="left" alt="" src="/userfiles/image/DSC_0389.JPG" /></td>
<td><img class="cstm" align="left" alt="" src="/userfiles/image/ktxh.jpg" /></td>
</tr>
</tbody>
</table>
Since you mention that $content is dynamic, try adding a style rule that targets the images inside the table:
<style>
table td>img
{
width:200px;
height:200px
}
</style>
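If the stored markup itself has to change rather than only its styling, the attributes can be rewritten in PHP before output. This is a minimal sketch using DOMDocument; normalizeImages is a hypothetical helper name, and the wrapper div exists only so the fragment can be pulled back out after parsing.

```php
<?php
// Sketch only: normalizeImages() is a hypothetical helper, not a built-in.
function normalizeImages(string $content, int $size = 200): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // stored markup may be malformed
    // Wrap the fragment so its nodes can be found again after parsing.
    $doc->loadHTML('<div>' . $content . '</div>');
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('img') as $img) {
        $img->setAttribute('width', (string) $size);
        $img->setAttribute('height', (string) $size);
    }
    foreach ($doc->getElementsByTagName('table') as $table) {
        $table->setAttribute('border', '0');
    }

    // Serialize only the children of the wrapper div, not the html/body shell.
    $out = '';
    foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling normalizeImages($content) just before echoing the database value would leave the stored data untouched while changing what is displayed.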

HTML Purifier for webmail

I'm working on a small webmail client. To embed HTML safely I want to use HTML Purifier (by the way, is that a good idea?).
I tested it with several emails and found some problems. One email (from Google) contains something like this:
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="4%">
<td width="92%" style="padding-top:18px; padding-bottom:10px; opacity:0.7">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<td width="30%">
<img style="display:inline-block;" height="26" src="https://www.gstatic.com/local/guides/email/images/photo-impact/googlelogo_light_clr-f040d5d9.png">
<td>
<td width="70%" style="text-align:right">
</td>
</tbody>
</table>
Converts to:
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="4%">
</td><td width="92%" style="padding-top:18px;padding-bottom:10px;opacity:.7;">
</td><td width="30%">
<img style="display:inline-block;" height="26" src="https://www.gstatic.com/local/guides/email/images/photo-impact/googlelogo_light_clr-f040d5d9.png" alt="googlelogo_light_clr-f040d5d9.png">
</td><td>
</td><td width="70%" style="text-align:right;">
</td>
</tr></table>
I don't know why it removes the second <table> tag (it also closes the wrong <td> and removes <tbody>). Is it possible to configure HTML Purifier so that it handles these situations?

modify a <DIV> element with CURL

I know how to set or modify a field with cURL's curl_setopt function and how to work with cookies, but I could not find out how to modify an HTML element with cURL.
For instance, let's say I want to replace the function "submit_device_form" in the code below
<div id="bottombuttons">
<table border="0" cellspacing="0" cellpadding="0" style="width:100%;">
<!-- buttons -->
<tr>
<td> </td>
<td class="separatorButton">
<table border="0" cellspacing="0" cellpadding="0">
<tr>
<td>Set</td>
<td>Cancel</td>
</tr>
</table>
</td>
</tr>
<!-- fixes the width of the first column -->
<tr>
<td style="padding:0;"><img src="images/spacer.gif" alt="" width="200" height="1" border="0" ></td>
<td style="padding:0;"><img src="images/spacer.gif" alt="" width="360" height="1" border="0" ></td>
</tr>
</table>
</div>
with "submit_MY_form", a new JavaScript function included in my PHP script. How would I go about it?
I looked into the "PHP Simple HTML DOM Parser" library, since it provides the setAttribute and removeAttribute functions that would let me do that easily, but I can't find how to give it access to the page read via cURL, so I'm stuck.
Thanks for your help,
Robert
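One way to connect the two steps: cURL only fetches the page as a string, and a DOM parser then edits that string. This is a minimal sketch using PHP's built-in DOMDocument instead of Simple HTML DOM Parser. fetchPage and renameHandler are hypothetical helper names, and the sketch assumes the function name appears in onclick attributes, which the pasted markup does not show.

```php
<?php
// Sketch only: fetchPage() and renameHandler() are hypothetical helpers.
function fetchPage(string $url): string
{
    $ch = curl_init($url);
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($ch);
    return $html === false ? '' : $html;
}

// Assumption: the handler name lives in onclick attributes.
function renameHandler(string $html, string $old, string $new): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world pages are rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//*[@onclick]') as $el) {
        $el->setAttribute(
            'onclick',
            str_replace($old, $new, $el->getAttribute('onclick'))
        );
    }
    return $doc->saveHTML();
}
```

A typical call would be renameHandler(fetchPage($url), 'submit_device_form', 'submit_MY_form'), followed by echoing the result. Simple HTML DOM Parser can, as far as I know, also parse a string through str_get_html, so the same fetch step would feed it too.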

Cleaning Emails collected from IMAP for insertion to a database

I'm in the process of creating a script for our internal customer support system. I want to collect emails from our IMAP inbox (hosted on Gmail) and parse the emails into the database.
What is the best way to clean frames, badly coded tags, and messy formatting so the result is a clean text with minimal formatting?
I'm aware Regular Expressions will most likely play heavily, but I want to know if this functionality exists in another library somewhere that I'm missing.
Edit: More specifically, what needs to be removed:
All inline CSS/styling, and all HTML except simple formatting like bold, underline, and italics.
Here's an email I'm using as a test case. It's a fairly beefy spam email I got from ZoneAlarm, and it has a bit of everything.
<td>
<br>
<br>
<table align="center" bgcolor="#749FD0" border="0" cellpadding="0" cellspacing="0" style="font-family:Arial,Helvetica,sans-serif;font-size:12px;line-height:16px;color:#555555" valign="top" width="700">
<tbody>
<tr>
<td>
<table align="center" border="0" cellpadding="0" cellspacing="0" valign="top" width="680">
<tbody>
<tr>
<td height="10">
<img border="0" height="1" src="http://download.zonealarm.com/bin/images/email/socialguard/spacer.gif" style="display: block; max-width: 2880px;" width="1"></td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" valign="top" width="680">
<tbody>
<tr>
<td height="10" width="10">
<img border="0" height="10" src="http://www.zonealarm.com/email/campaigns/2013/2013_06_SummerSale/nw.png" style="display: block; max-width: 2880px;" width="10"></td>
<td bgcolor="#E3ECEC" height="10" width="660">
<img alt="ZoneAlarm by Check Point Software Technologies LTD." border="0" src="http://www.zonealarm.com/email/campaigns/2013/2013_05_MemorialDay/za_transparent.png" width="120" style="display: block; max-width: 2880px;" title="ZoneAlarm by Check Point Software Technologies LTD."></td>
<td align="right" style="font-family:Arial,Helvetica,sans-serif" width="150">
<span style="color:#999999;font-size:12px">Connect with ZoneAlarm</span></td>
<td align="right" valign="middle" width="125">
<img alt="ZoneAlarm Facebook" border="0" src="http://www.zonealarm.com/email/campaigns/2013/2013_05_MemorialDay/facebook.png" width="22" title="ZoneAlarm Facebook" style="max-width: 2880px;"> <img alt="ZoneAlarm Twitter" border="0" width="22" src="http://www.zonealarm.com/email/campaigns/2013/2013_05_MemorialDay/twitter.png" title="ZoneAlarm Twitter" style="max-width: 2880px;"> <img alt="ZoneAlarm YouTube" border="0" src="http://www.zonealarm.com/email/campaigns/2013/2013_05_MemorialDay/youtube.png" title="ZoneAlarm YouTube" height="22" style="max-width: 2880px;"><img border="0" height="15" src="http://download.zonealarm.com/bin/images/email/socialguard/spacer.gif" width="10" style="max-width: 2880px;"></td>
<td bgcolor="#E3ECEC" rowspan="6" align="center" valign="top" width="1">
<img align="right" height="32" src="http://download.zonealarm.com/bin/images/emails/welcome/borderx1.png" width="1" style="max-width: 2880px;">
</td>
</tr>
</tbody>
</table>
<table align="center" border="0" cellpadding="0" cellspacing="0" valign="top" width="680">
<tbody>
<tr>
<td height="10" width="10">
<img border="0" height="10" src="http://www.zonealarm.com/email/campaigns/2013/2013_06_SummerSale/sw.png" style="display: block; max-width: 2880px;" width="10"></td>
<td bgcolor="#E3ECEC" height="10" width="660">
You can use HTML Purifier for this, see: http://htmlpurifier.org/
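To illustrate the kind of reduction being asked for, here is a minimal sketch that uses only PHP built-ins rather than HTML Purifier. It is not a sanitizer and should not be trusted with hostile input; simplifyEmailHtml is a hypothetical helper name, and the attribute-stripping pattern assumes attribute values contain no ">" character.

```php
<?php
// Sketch only: a stdlib stand-in, not a replacement for HTML Purifier.
function simplifyEmailHtml(string $html): string
{
    // Keep only basic emphasis tags; every other tag is dropped, text stays.
    $kept = strip_tags($html, '<b><strong><i><em><u>');
    // strip_tags() leaves attributes on the kept tags, so drop them too.
    $kept = preg_replace('/<(b|strong|i|em|u)\b[^>]*>/i', '<$1>', $kept);
    // Collapse the whitespace left behind by the removed table markup.
    return trim(preg_replace('/\s+/', ' ', $kept));
}
```

For mail from unknown senders, a whitelist-based library such as HTML Purifier remains the safer choice, since it parses the markup instead of pattern-matching it.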

put link on table row or any other way to put link

I have some PHP code that creates buttons:
for($i=1;$i<=$n;$i++)
{
$row=mysql_fetch_array($result);
if($row['btn_color']==1)
$btbg="side-button5.png";
if($row['btn_color']==2)
$btbg="side-button6.png";
if($row['btn_color']==3)
$btbg="side-button7.png";
if($row['btn_color']==4)
?>
<br>
<table width="200" height="50" border="0" cellpadding="0" cellspacing="0">
<tr>
<td background="images/<?php echo $btbg ; ?>" style="background-repeat:no-repeat"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td height="66">
<div align="center" class="buttonside">
<p>
<a class="buttonside" href="vpa.php?pgid=<?php echo $row['page_id']; ?>">
<?php echo $row['btn_text']?></p>
</a>
</div>
</td>
</tr>
</table>
</td>
</tr>
</table>
<?php
}
?>
This code works fine, but the link is only on the text. I want the link to cover the whole button (the background).
Thanks
To make the button 'linkable', you'll need to wrap the <a> tag around it.
However, you're going to need to change your HTML markup structure - you can't wrap an anchor around a table cell!
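One possible restructuring is to drop the nested table and make the anchor itself the button: a block-level link that carries the background image. This is a minimal sketch; renderButton and its parameters are hypothetical names, and the 200 x 66 size is taken from the table and cell dimensions in the question.

```php
<?php
// Sketch only: renderButton() and its parameters are hypothetical names.
function renderButton(string $background, string $pageId, string $text): string
{
    $href = 'vpa.php?pgid=' . rawurlencode($pageId);
    // display:block makes the whole 200x66 area clickable, not just the text.
    $style = 'display:block;width:200px;height:66px;line-height:66px;'
           . 'text-align:center;background:url(images/' . $background . ') no-repeat';

    return '<a class="buttonside" href="' . htmlspecialchars($href) . '" style="'
         . htmlspecialchars($style) . '">' . htmlspecialchars($text) . '</a>';
}
```

Inside the loop this would be called as echo renderButton($btbg, $row['page_id'], $row['btn_text']); in place of the table markup.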
