SimpleXML won't return data properly - php

I'm using SimpleXML to parse a data file from an external source. I'm trying to pull a thumbnail from the result, which looks like this:
<entry>
<title>Ball_Punch</title>
<author>
<name>burningcandle2010</name>
<uri>https://www.mochimedia.com/community/profile/burningcandle2010</uri>
</author>
<link href="http://www.mochimedia.com/games/allout-offsite" rel="alternate" />
<link href="http://games.mochiads.com/c/g/allout-offsite/Ball_Punch.swf" rel="enclosure" type="application/x-shockwave-flash" />
<id>urn:uuid:bf720e45-7ca0-34c7-a63a-0f6f20a4c267</id>
<media:player height="470" url="http://games.mochiads.com/c/g/allout-offsite/Ball_Punch.swf" width="798" />
<media:thumbnail height="100" url="http://thumbs.mochiads.com/c/g/allout-offsite/_thumb_100x100.png" width="100" />
<media:title>Ball_Punch</media:title>
<media:description>punch your ball</media:description>
<media:keywords>other, rhythm</media:keywords>
<category term="Puzzles" />
<updated>2010-07-04T08:13:22.571963-08:00</updated>
<published>2010-07-04T06:58:55.577826-08:00</published>
<summary type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<a href="http://www.mochimedia.com/games/allout-offsite">
<img class="thumbnail" src="http://thumbs.mochiads.com/c/g/allout-offsite/_thumb_100x100.png" />
</a>
<dl>
<dt>Tag</dt>
<dd class="tag">cdb41e529fbe39bd</dd>
<dt>Description</dt>
<dd class="description">punch your ball</dd>
<dt>Resolution</dt>
<dd class="resolution">798x470</dd>
<dt>Instructions</dt>
<dd class="instructions" />
<dt>Key Mappings</dt>
<dd class="key_mappings" />
<dt>Control Scheme</dt>
<dd class="control_scheme">{"fire": "left_mouse", "jump": "space", "movement": "mouse"}</dd>
<dt>Categories</dt>
<dd class="categories">Puzzles</dd>
<dt>Keywords</dt>
<dd class="keywords">other, rhythm</dd>
<dt>Rating</dt>
<dd class="rating">Everyone</dd>
<dt>Leaderboards</dt>
<dd class="leaderboards">False</dd>
<dt>Embed</dt>
<dd>
<code class="embed"><embed src="http://games.mochiads.com/c/g/allout-offsite/Ball_Punch.swf" menu="false" quality="high" width="798" height="470" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer" /></code>
</dd>
<dt>Slug</dt>
<dd class="slug">allout-offsite</dd>
<dt>Featured</dt>
<dd class="recommended">False</dd>
<dt>Zip File</dt>
<dd class="zip_url">http://games.mochiads.com/c/g/allout-offsite.zip</dd>
<dt>SWF file size</dt>
<dd class="swf_file_size">184374</dd>
</dl>
</div>
</summary>
</entry>
My code is here:
$thumbnail = $game->summary->div->a->img->attributes()->src;
However, when I run this through print_r($thumbnail), I get:
SimpleXMLElement Object
(
[0] => DATA_I_WANT
)
No matter what I do, it always winds up being this or an empty SimpleXMLElement Object. I've tried ->src[0], ->src->{'0'}, etc. to no avail.

Try simply typecasting to a string:
print_r((string) $thumbnail);
SimpleXML returns a SimpleXMLElement object for every element and attribute, even when it holds a single value, so you have to convert it before you can use the data as a plain string. I'm in the habit of typecasting everything it returns.
Good luck!

Just cast to a string. :]
$thumbnail = (string)$game->summary->div->a->img->attributes()->src;
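To make the cast concrete, here is a minimal sketch on a cut-down fragment. The XML string and the example.com URLs are made up for illustration, and the namespaces from the real feed are left out.

```php
<?php
// Hypothetical, simplified fragment (no namespaces) standing in for the feed entry.
$game = simplexml_load_string(
    '<entry><summary><div><a href="http://example.com/game">'
    . '<img class="thumbnail" src="http://example.com/thumb.png" />'
    . '</a></div></summary></entry>'
);

// Attribute access returns a SimpleXMLElement object, not a string.
$src = $game->summary->div->a->img->attributes()->src;
$isObject = $src instanceof SimpleXMLElement;

// The explicit cast yields the plain string value.
$thumbnail = (string) $src;
echo $thumbnail, PHP_EOL;
```

Array-style access, `(string) $game->summary->div->a->img['src']`, should behave the same way on this fragment.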

Related

How to parse XML's <media:text type="html"> with PHP

I would be happy if there was someone who can tell me how to decode the following string from XML to PHP:
<media:text type="html">
<p>
<a href="foo.com">
<img src="foo.com/foo.jpg" align="left" alt="Foo title" title="Foo title" border="0" />
</a>
</p>
</media:text>
which is part of the following item:
<item>
<title>Foo title</title>
<description>Foo Description</description>
<link>foo.com</link>
<pubDate>Tue, 02 Feb 2021 18:23:51 EST</pubDate>
<media:content url="foo.com/foo.jpg" />
<media:text type="html">
<p>
<a href="foo.com">
<img src="foo.com/foo.jpg" align="left" alt="Foo title" title="Foo title" border="0" />
</a>
</p>
</media:text>
</item>
With the code portion
$content = $xml->channel->item[$i]->children('media', true)->content->attributes();
With this I can only read the media:content value, but I can't extract
<media:text type="html">
Thanks to those who can help me!
You can use the SimpleXMLElement class to parse your XML. You get back an object that you can traverse much like an array.
See https://www.php.net/manual/fr/simplexml.examples-basic.php.
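As a rough sketch of how that could look for media:text: the feed below is a made-up reduction of the item in the question, and the namespace URI is an assumption (the usual Media RSS one). It also assumes the HTML is embedded as real child elements. If the feed delivers it entity-escaped or as CDATA instead, `(string) $media->text` would return the markup directly.

```php
<?php
// Hypothetical feed; the xmlns:media URI is assumed, it is not shown in the question.
$rss = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Foo title</title>
      <media:content url="foo.com/foo.jpg" />
      <media:text type="html">
        <p><a href="foo.com"><img src="foo.com/foo.jpg" alt="Foo title" /></a></p>
      </media:text>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($rss);

// All children of the item that use the "media" prefix.
$media = $xml->channel->item[0]->children('media', true);

$contentUrl = (string) $media->content->attributes()->url;
$textType   = (string) $media->text->attributes()->type;

// The <p> inside media:text has no prefix, so switch back with children().
$html = '';
foreach ($media->text->children() as $child) {
    $html .= $child->asXML();
}
$imgSrc = (string) $media->text->children()->p->a->img['src'];

echo $html, PHP_EOL;
```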

str_replace bug with foreach loop

I've got a problem with my little text to smiley/emoji function in php, which is based on str_replace. This is my code.
$smileys = array( ":inlove:" => "/smileys/smiley2.png",
":cool:" => "/smileys/smiley3.png",
":tongue:" => "/smileys/smiley4.png",
":wow:" => "/smileys/smiley5.png",
":smile:" => "/smileys/smiley15.png",
":happy:" => "/smileys/smiley6.png",
":funny:" => "/smileys/smiley7.png",
":wink:" => "/smileys/smiley8.png",
":worried:" => "/smileys/smiley10.png",
":pokerface:" => "/smileys/smiley9.png",
":poop:" => "/smileys/smiley12.png",
":thinking:" => "/smileys/35_thinking.png",
":triumph:" => "/smileys/49_triumph.png",
":vulcan:" => "/smileys/109_vulcan.png",
":pointup:" => "/smileys/102_point_up_2.png",
":santa:" => "/smileys/135_santa.png",
":spy:" => "/smileys/134_spy.png");
if(isset($_POST['message'])) {
$messagePlain = $_POST['message'];
$messageSmileys = $messagePlain;
foreach($smileys as $key => $img) {
$messageSmileys = str_replace($key, '<img src="' . $img . '" />', $messageSmileys);
}
$connection->query(/* Message to db */);
}
It works fine. But when the user enters more than ~14 emojis in a row, the HTML gets destroyed. The HTML source looks like this:
<div class="media-body">
<h4 class="media-heading">test <small>07. August. 2017 01:34</small></h4>
<img src="/smileys/smiley2.png" /> <img src="/smileys/smiley3.png" /> <img src="/smileys/smiley5.png" /> <img src="/smileys/smiley4.png" /> <img src="/smileys/smiley15.png" /> <img src="/smileys/smiley6.png" /> <img src="/smileys/smiley7.png" /> <img src="/smileys/smiley8.png" /> <img src="/smileys/smiley10.png" /> <img src="/smileys/smiley9.png" /> <img src="/smileys/smiley12.png" /> <img src="/smileys/49_triumph.png" /> <img src="/smileys/109_vulcan.png" /> <img src="/smileys/102_point_up_2.p </div>
</div>
Could someone help me with this? Why is the HTML tag for <img> suddenly destroyed?
Looking at your output:
<img src="/smileys/smiley2.png" /> <img src="/smileys/smiley3.png" /> <img src="/smileys/smiley5.png" /> <img src="/smileys/smiley4.png" /> <img src="/smileys/smiley15.png" /> <img src="/smileys/smiley6.png" /> <img src="/smileys/smiley7.png" /> <img src="/smileys/smiley8.png" /> <img src="/smileys/smiley10.png" /> <img src="/smileys/smiley9.png" /> <img src="/smileys/smiley12.png" /> <img src="/smileys/49_triumph.png" /> <img src="/smileys/109_vulcan.png" /> <img src="/smileys/102_point_up_2.p
The above string is 499 characters long. I strongly suspect that the message column in your database table is limited to about 500 characters and that the stored value is truncated at that length.
Solution / Suggestion
If you are using MySQL, I would strongly recommend changing the column type from VARCHAR(500) to TEXT or LONGTEXT.
Your database field probably has a length limit. You might want to raise that, but there’s a more important initial fix: store the original text in the database and perform the replacements on the output instead.
This is especially important because it looks like you're probably vulnerable to HTML injection right now. Make sure to run htmlspecialchars before doing the replacement, and be aware of how you need to encode your output to make it safe. The question "How to prevent XSS with HTML/PHP?" might be a good start.
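A minimal sketch of that order of operations: the raw message is what gets stored, and the smiley replacement happens only at output time. The function name `renderMessage` and the shortened smiley list are made up for illustration, and `strtr` is swapped in for the `str_replace` loop so that all codes are replaced in a single pass.

```php
<?php
// Shortened, illustrative smiley map (same shape as in the question).
$smileys = [
    ':cool:' => '/smileys/smiley3.png',
    ':wink:' => '/smileys/smiley8.png',
];

// Hypothetical helper: escape the stored plain text first, then insert the <img> tags.
function renderMessage(string $plain, array $smileys): string
{
    // Neutralise any HTML the user typed; colons are left untouched,
    // so the smiley codes survive the escaping.
    $safe = htmlspecialchars($plain, ENT_QUOTES, 'UTF-8');

    $replacements = [];
    foreach ($smileys as $code => $img) {
        $replacements[$code] = '<img src="' . htmlspecialchars($img, ENT_QUOTES, 'UTF-8') . '" />';
    }

    // strtr() with an array replaces every code in one pass.
    return strtr($safe, $replacements);
}

// Store $_POST['message'] unchanged in the database; call this when displaying it.
echo renderMessage('<b>hi</b> :cool:', $smileys), PHP_EOL;
```

Because the database then only holds the short codes, a run of emojis takes far less space than the generated `<img>` tags.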

php preg_replace href attribute using conditions

Hello, I have the following tags:
$content ='<a href="http://website.com/" />
<a href="/link1" />
<a href="https://website.com" />
<a href="link1" />';
and this code :
preg_replace('~href=(\'|"|)(.*?)(\'|"|)(?<!\/|http:\/\/|https:\/\/)~i', 'href=$1http://website2.com/$2$3', $content);
I want to use the code above to replace href attributes that don't start with http, https, or a slash. Thanks in advance.
Something like this should do it. For tasks like this I'd advise using a parser in the future, though (see How do you parse and process HTML/XML in PHP?), because regexes on HTML can become very messy quickly. Here's a link on this regex technique as well: http://www.rexegg.com/regex-best-trick.html.
Regex:
/href=("|')https?:\/\/(*SKIP)(*FAIL)|href=("|')(.*?)\2/
Demo: https://regex101.com/r/cV2xB5/1
PHP usage (note the added \/? so that a leading slash is not duplicated in the result):
$content ='<a href="http://website.com/" />
<a href="/link1" />
<a href="https://website.com" />
<a href="link1" />';
echo preg_replace('/href=("|\')https?:\/\/(*SKIP)(*FAIL)|href=("|\')\/?(.*?)\2/', 'href=$2http://website2.com/$3$2', $content);
Output:
<a href="http://website.com/" />
<a href="http://website2.com/link1" />
<a href="https://website.com" />
<a href="http://website2.com/link1" />
Update, for // exclusion
Use:
href=("|')(?:https?:)?\/\/(*SKIP)(*FAIL)|href=("|')(.*?)\2
Demo: https://regex101.com/r/cV2xB5/2
PHP (again with \/? added before the captured path):
$content ='<a href="//website.com/" />
<a href="/link1" />
<a href="https://website.com" />
<a href="link1" />';
echo preg_replace('/href=("|\')(?:https?:)?\/\/(*SKIP)(*FAIL)|href=("|\')\/?(.*?)\2/', 'href=$2http://website2.com/$3$2', $content);
Output:
<a href="//website.com/" />
<a href="http://website2.com/link1" />
<a href="https://website.com" />
<a href="http://website2.com/link1" />
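For comparison, the parser-based route recommended above could look roughly like this. It is a sketch using PHP's bundled DOMDocument rather than the regex. The function name `prefixRelativeHrefs` is made up, and the sample markup uses explicit `</a>` closing tags because an HTML parser ignores the self-closing slash on `<a>`.

```php
<?php
// Hypothetical helper: prefix every relative href with $base and return the resulting hrefs.
function prefixRelativeHrefs(string $html, string $base): array
{
    // Collect libxml warnings internally instead of printing them for fragments.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        // Leave absolute (http://, https://) and protocol-relative (//) URLs alone.
        if (!preg_match('~^(?:https?:)?//~i', $href)) {
            $href = rtrim($base, '/') . '/' . ltrim($href, '/');
            $a->setAttribute('href', $href);
        }
        $hrefs[] = $href;
    }
    // $doc->saveHTML() would serialise the modified document if the markup itself is needed.
    return $hrefs;
}

$content = '<a href="http://website.com/"></a>
<a href="/link1"></a>
<a href="https://website.com"></a>
<a href="link1"></a>';

$result = prefixRelativeHrefs($content, 'http://website2.com');
print_r($result);
```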

How to keep object tag when parsing HTML Content use SimpleHTMLDom

I want to keep all the content of the div with class content (keeping the original HTML, including the Flash object video). But when I parse it with SimpleHTMLDom, I don't get all the content I want.
//File test.html
<div class="content">
<div style="text-align:center;">
<object id="fpt_player_3548_0" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=9,0,0,0" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" height="296" width="470"><param value="http://st.f2.vnecdn.net/f/v37/fptplayer_embed.swf" name="movie"><param value="high" name="quality"><param value="transparent" name="wmode"><param value="xmlPath=http://vnexpress.net/video/vne-info/id/3548/type/2&colorAux=0x0099ff&colorBorder=0x333333&colorMain=0xffffff&local=embed&mAuto=false&autoHide=false&trackurl=&tracktype=video" name="flashvars"><param value="true" name="allowfullscreen"><param value="always" name="allowScriptAccess"><embed id="fpt_player_3548_0" name="fpt_player_3548_0" src="http://st.f2.vnecdn.net/f/v37/fptplayer_embed.swf" pluginspage="http://www.macromedia.com/go/getflashplayer" wmode="transparent" allowscriptaccess="always" allowfullscreen="true" type="application/x-shockwave-flash" flashvars="xmlPath=http://vnexpress.net/video/vne-info/id/3548/type/2&colorAux=0x0099ff&colorBorder=0x333333&colorMain=0xffffff&local=embed&mAuto=false&autoHide=false&trackurl=&tracktype=video" height="296" width="470">
</object>
</div>
<p class="Normal">
<em>Something</em>
More contents
</p>
</div>
Code to parse the HTML using SimpleHTMLDom:
$html = file_get_html('test.html');
// find all div tags with class=content
foreach($html->find('div.content') as $e)
echo $e->innertext . '<br>';
This is the output I get (the object tag is missing):
<div style="text-align:center;">
<div id="video-3548" data-component="true" data-component-type="video" data-component-value="3548" data-component-typevideo="2" style="display:none;">
</div>
</div>
<p class="Normal">
<em>Something</em>
More contents
</p>

XPATH expression to select all elements that do not have a certain parent

I am using XPATH in PHP. Assume I have the following HTML
<div>
<a>
<img id='img1' src='src1' />
</a>
<div>
<img id='img2' src='src2' />
</div>
<img id='img3' src='src3' />
<div>
<a>
<img id='img4' src='src4' />
</a>
</div>
<a>
<div>
<img id='img5' src='src5' />
</div>
</a>
</div>
I would like to write an XPATH expression that will select all <img> tags that do not have an <a> tag as their direct parent.
So in the example above, only img2, img3, and img5 should result from that expression's selection.
//img[not(parent::a)]
Give that a try.
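Since the question mentions PHP, here is a small sketch of running that expression with DOMXPath. The markup is a well-formed version of the sample from the question (the closing tags are assumed), and it is loaded with loadXML so the tree is not restructured by an HTML parser.

```php
<?php
// Well-formed reduction of the sample markup; the exact nesting is an assumption.
$markup = <<<HTML
<div>
  <a><img id="img1" src="src1" /></a>
  <div><img id="img2" src="src2" /></div>
  <img id="img3" src="src3" />
  <div><a><img id="img4" src="src4" /></a></div>
  <a><div><img id="img5" src="src5" /></div></a>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadXML($markup);
$xpath = new DOMXPath($doc);

// parent::a only tests the direct parent, so img5 (inside a > div) is still selected.
$ids = [];
foreach ($xpath->query('//img[not(parent::a)]') as $img) {
    $ids[] = $img->getAttribute('id');
}
print_r($ids);
```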
