Find and replace a piece of code in HTML - php

I have a template with predefined blocks. It looks like this:
<body>
<div class="row">
<div class="col-lg-6">
<include name="position1" type="module">
</div>
<div class="col-lg-6">
<include name="position2" type="module">
</div>
</div>
<div class="row">
<div class="col-lg-10">
<include type="content">
</div>
<div class="col-lg-2">
<include name="right" type="module">
</div>
</div>
</body>
My predefined blocks:
-position1
-position2
-right
I'm looking for a way to get the blocks above as an array from my page,
and a small function that replaces a block with other content, given the block name.
For example:
function replace_block('position1', '<b>Replaced content</b>') {
... code
And as output:
...
<div class="col-lg-6">
<b>Replaced content</b>
</div>
...
}
Thanks!

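Try this:
<?php
// $html is assumed to hold the template markup from the question.
function replaceBlock($name, $replacement, $html) {
// Escape the block name so regex metacharacters in it are matched literally.
$pattern = '/<include name="' . preg_quote($name, '/') . '"[^>]*>/';
// A callback keeps "$1" or "\1" inside $replacement from being read as backreferences.
return preg_replace_callback($pattern, function () use ($replacement) {
return $replacement;
}, $html);
}
echo replaceBlock('position1', 'this is a replacement string', $html);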
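The question also asks for the block names as an array, which the snippet above does not cover. A minimal sketch, assuming every module block is written as `<include name="..." ...>` with `name` as the first attribute (the helper name `getBlockNames` and the sample `$template` are made up for illustration):

```php
<?php
// Hypothetical helper: collect the name attribute of every <include name="..."> tag.
function getBlockNames(string $html): array
{
    // Capture group 1 holds the value of the name attribute.
    preg_match_all('/<include\s+name="([^"]+)"[^>]*>/', $html, $matches);
    return $matches[1];
}

// Shortened stand-in for the template in the question.
$template = '<div><include name="position1" type="module"></div>'
          . '<div><include type="content"></div>'
          . '<div><include name="right" type="module"></div>';

print_r(getBlockNames($template));
```

The `<include type="content">` tag is skipped because it has no `name` attribute. If attribute order can vary in your templates, a DOM parser is probably a safer choice than a regex.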

Related

Get content inside nested div class using Goutte PHP

Sorry for my bad English.
I want to scrape some content from a website, but the div classes are nested and that confuses me.
Basically the structure is :
<div id="gsc_vcd_table">
<div class="gs_scl">
<div class="gsc_vcd_field">
Pengarang
</div>
<div class="gsc_vcd_value">
I Anggara Wijaya, Djoko Budiyanto Setyohadi
</div>
</div>
<div class="gs_scl">
<div class="gsc_vcd_field">
Tanggal Terbit
</div>
<div class="gsc_vcd_value">
2017/3/1
</div>
</div>
</div>
I want to get the text "I Anggara Wijaya, Djoko Budiyanto Setyohadi" from the Pengarang field and "2017/3/1" from the Tanggal Terbit field.
$crawlerdetail=$client->request('GET',$detail);
$detailscholar=$crawlerdetail->filter('div.gsc_vcd_table');
foreach ($detailscholar as $key)
{
$keyCrawler=new Crawler($key);
$pengarang=($scCrawler->filter('div.gsc_vcd_value')->count()) ? $scCrawler->filter('div.gsc_vcd_value')->text() : '';
echo $pengarang;
}
Help me please.
If you want to use the SimpleXMLElement class, see this code:
<?php
$string = <<<XML
<div id="gsc_vcd_table">
<div class="gs_scl">
<div class="gsc_vcd_field">
Pengarang
</div>
<div class="gsc_vcd_value">
I Anggara Wijaya, Djoko Budiyanto Setyohadi
</div>
</div>
<div class="gs_scl">
<div class="gsc_vcd_field">
Tanggal Terbit
</div>
<div class="gsc_vcd_value">
2017/3/1
</div>
</div>
</div>
XML;
$xml = new SimpleXMLElement($string);
$result1 = $xml->xpath("//div[contains(@class, 'gsc_vcd_field')]");
$result2 = $xml->xpath("//div[contains(@class, 'gsc_vcd_value')]");
foreach ($result1 as $key => $node) {
echo "FIELD: $result1[$key] , VALUE: $result2[$key]<br>\n";
}
To get the XPath of any element, you can also inspect it in Chrome's developer tools and choose Copy XPath.
Another solution is to use preg_match_all:
// \s* tolerates any line endings and indentation between the tags
preg_match_all('/<div class="gsc_vcd_field">\s*(.*?)\s*<\/div>\s*<div class="gsc_vcd_value">\s*(.*?)\s*<\/div>/s', $string, $matches);
foreach ($matches[1] as $key => $match) {
echo "FIELD: " . $matches[1][$key] . " , VALUE: " . $matches[2][$key] . "<br>\n";
}
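If the page is ordinary HTML rather than well-formed XML, the built-in DOMDocument and DOMXPath classes may be more forgiving than SimpleXMLElement. The following is only a sketch, not the Goutte approach from the question: it assumes the markup looks like the sample above and pairs each field with its value row by row (the `$details` array is a name chosen for illustration):

```php
<?php
$html = <<<HTML
<div id="gsc_vcd_table">
  <div class="gs_scl">
    <div class="gsc_vcd_field">Pengarang</div>
    <div class="gsc_vcd_value">I Anggara Wijaya, Djoko Budiyanto Setyohadi</div>
  </div>
  <div class="gs_scl">
    <div class="gsc_vcd_field">Tanggal Terbit</div>
    <div class="gsc_vcd_value">2017/3/1</div>
  </div>
</div>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // silence warnings about imperfect HTML
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$details = [];
// Note: gsc_vcd_table is an id, so it is selected with @id, not as a class.
foreach ($xpath->query('//div[@id="gsc_vcd_table"]/div[@class="gs_scl"]') as $row) {
    // Relative queries run against the current row only.
    $field = $xpath->query('div[@class="gsc_vcd_field"]', $row)->item(0);
    $value = $xpath->query('div[@class="gsc_vcd_value"]', $row)->item(0);
    if ($field !== null && $value !== null) {
        $details[trim($field->textContent)] = trim($value->textContent);
    }
}

print_r($details);
```

Querying per row keeps each field attached to its own value, so a row with a missing value cannot shift the pairs out of step.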

str_replace works incorrectly ("str_replace" makes changes to the $replace parameter)

Good day!
Why is it that, when the number is 10 or greater, str_replace() seems to change the $replace parameter, dropping the units digit and keeping only the tens digit?
Input data ($data):
...
<div onclick="window.location.href='/template-04.php?type=users&char=7';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=8';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=9';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=10';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=11';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=12';"></div>
...
very simple PHP code:
for($axx = 0; $axx < 68; $axx ++)
{
$z = '['.$axx.']';
$newName = 'templ4-user-'.$z.'.html?'.$z;
echo '<br>'.$newName; // echo (axx = 13): <br>templ4-user-[13].html?[13]
$data = str_replace('template-04.php?type=users&char='.$axx, $newName, $data);
}
The resulting $data is incorrect when $axx is 10 or greater. Why?
...
<div onclick="window.location.href='/templ4-user-[7].html?[7]';"></div>
<div onclick="window.location.href='/templ4-user-[8].html?[8]';"></div>
<div onclick="window.location.href='/templ4-user-[9].html?[9]';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]0';"></div> <------ !!!!!!!
<div onclick="window.location.href='/templ4-user-[1].html?[1]1';"></div> <------ !!!!!!!
<div onclick="window.location.href='/templ4-user-[1].html?[1]2';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]3';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]4';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]5';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]6';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]7';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]8';"></div>
<div onclick="window.location.href='/templ4-user-[1].html?[1]9';"></div>
<div onclick="window.location.href='/templ4-user-[2].html?[2]0';"></div>
...
Please help.
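This happens because, in the iteration where $axx is 1, the search string ending in char=1 also matches the beginning of char=10 through char=19. So char=12 becomes [1]2, and by the time the loop reaches 12 there is nothing left to match.
Instead of a loop, you could use preg_replace: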
$data = <<<EOS
<div onclick="window.location.href='/template-04.php?type=users&char=7';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=8';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=9';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=10';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=11';"></div>
<div onclick="window.location.href='/template-04.php?type=users&char=12';"></div>
EOS;
$pattern = '/template-04\.php\?type=users&char=(\d+)/i'; // the dot is escaped so it matches a literal "."
$replacement = 'templ4-user-[$1].html?[$1]';
echo preg_replace($pattern, $replacement, $data);
Result:
<div onclick="window.location.href='/templ4-user-[7].html?[7]';"></div>
<div onclick="window.location.href='/templ4-user-[8].html?[8]';"></div>
<div onclick="window.location.href='/templ4-user-[9].html?[9]';"></div>
<div onclick="window.location.href='/templ4-user-[10].html?[10]';"></div>
<div onclick="window.location.href='/templ4-user-[11].html?[11]';"></div>
<div onclick="window.location.href='/templ4-user-[12].html?[12]';"></div>
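If you would rather keep str_replace, one option is to include the character that follows the number (here the closing single quote) in the search string, so that char=1 can no longer match the start of char=12. This is a sketch on a shortened, made-up input, not part of the original answer:

```php
<?php
// Shortened stand-in for the question's $data.
$data = "char=1'; char=12';";

// The original approach: 'char=1' is a prefix of 'char=12', so both are hit.
$buggy = str_replace('char=1', 'user-[1]', $data);

// Anchoring on the trailing quote makes each search string unambiguous.
$fixed = $data;
foreach ([1, 12] as $n) {
    $fixed = str_replace("char=$n'", "user-[$n]'", $fixed);
}

echo $buggy, "\n";
echo $fixed, "\n";
```

Counting down from the largest number to the smallest would avoid the same collision, since the longer numbers are replaced before their prefixes are tried.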

Get content between tags, including inner tags

I have the content:
<html>
<body>
<div class="another div">
other content
</div>
<div class="fck_detail width_common">
<p class="Normal">
Some text 1.
</p>
<p class="Normal">
Some text 2.
</p>
<div style="text-align:center;">
<div class="embed-container">
<div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
</div>
</div>
<p class="Normal">
Some text 3.
</p>
<p class="Normal">
Some text 4.
</p>
</div>
</body>
</html>
I use the function below to get content of 'div class="fck_detail width_common"'
function get_content_by_tag($content, $tag_and_more, $include_tag = true){
$p = stripos($content,$tag_and_more,0);
if($p==false) return "";
$content=substr($content,$p);
$p = stripos($content," ",0);
if(abs($p)==0) return "";
$open_tag = substr($content,0,$p);
$close_tag = substr($open_tag,0,1)."/".substr($open_tag,1).">";
$count_inner_tag = 0;
$p_open_inner_tag = 1;
$p_close_inner_tag = 0;
$count=1;
do{
$p_open_inner_tag = stripos($content,$open_tag,$p_open_inner_tag);
$p_close_inner_tag = stripos($content,$close_tag,$p_close_inner_tag);
$count++;
if($p_close_inner_tag!=false) $p = $p_close_inner_tag;
if($p_open_inner_tag!=false){
if(abs($p_open_inner_tag)<abs($p_close_inner_tag)){
$count_inner_tag++;
$p_open_inner_tag++;
}else{
$count_inner_tag--;
$p_close_inner_tag++;
}
}else{
$count_inner_tag--;
if($p_close_inner_tag>0) $p_close_inner_tag++;
}
}while($count_inner_tag>0);
if($include_tag)
return substr($content,0,$p+strlen($close_tag));
else{
$content = substr($content,0,$p);
$p = stripos($content,">",0);
return substr($content,$p+1);
}
}
Then I try:
echo get_content_by_tag($content, '<div class="fck_detail width_common">');
It only returns:
<div class="fck_detail width_common">
<p class="Normal">
Some text 1.
</p>
<p class="Normal">
Some text 2.
</p>
<div style="text-align:center;">
<div class="embed-container">
<div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
</div>
</div>
The paragraphs containing "Some text 3" and "Some text 4" are missing, along with the closing tag of the outer div.
Can anyone tell me what's wrong?
One way to do this is through PHP Simple HTML DOM Parser
$str = '
<html>
<body>
<div class="another div">
other content
</div>
<div class="fck_detail width_common">
<p class="Normal">
Some text 1.
</p>
<p class="Normal">
Some text 2.
</p>
<div style="text-align:center;">
<div class="embed-container">
<div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
</div>
</div>
<p class="Normal">
Some text 3.
</p>
<p class="Normal">
Some text 4.
</p>
</div>
</body>
</html>
';
$html = str_get_html($str);
echo $html->find("div[class='fck_detail width_common']",0)->innertext;
Try using this library: http://sourceforge.net/projects/simplehtmldom/
You can get your data in the following way:
$url = "www.yoururl.html";
// file_get_html() already returns a parsed document, so no separate "new simple_html_dom()" is needed
$html = file_get_html($url);
$data = $html->find('.fck_detail', 0);
or, if the markup is already in a string:
$html = str_get_html($str);
$data = $html->find("div[class='fck_detail width_common']", 0)->innertext;
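If adding a third-party library is not an option, PHP's built-in DOMDocument can do the same job. The sketch below is an alternative to the answers above, not a fix for the original function; it assumes the class attribute is exactly "fck_detail width_common" and uses a shortened version of the question's markup:

```php
<?php
// Shortened stand-in for the question's $content.
$content = '<html><body><div class="another div">other content</div>'
    . '<div class="fck_detail width_common"><p class="Normal">Some text 1.</p>'
    . '<div style="text-align:center;"><div class="embed-container"></div></div>'
    . '<p class="Normal">Some text 4.</p></div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // silence warnings about imperfect HTML
$dom->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$div = $xpath->query('//div[@class="fck_detail width_common"]')->item(0);

// Serialize each child node to get the inner HTML, nested divs included.
$inner = '';
if ($div !== null) {
    foreach ($div->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
}

echo $inner;
```

Because the parser tracks nesting itself, the inner divs do not cut the result short the way manual tag counting can.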

How do I loop through multiple child nodes of XML?

I am having some trouble trying to loop through an XML document. The XML looks like this:
<data>
<weather>
<hourly>
<time>0</time>
<tempC>17</tempC>
<tempF>62</tempF>
<windspeedMiles>24</windspeedMiles>
<windspeedKmph>39</windspeedKmph>
</hourly>
<hourly>
<time>3</time>
<tempC>16</tempC>
<tempF>60</tempF>
<windspeedMiles>22</windspeedMiles>
<windspeedKmph>35</windspeedKmph>
</hourly>
</weather>
<weather>
<hourly>
<time>0</time>
<tempC>17</tempC>
<tempF>62</tempF>
<windspeedMiles>24</windspeedMiles>
<windspeedKmph>39</windspeedKmph>
</hourly>
<hourly>
<time>3</time>
<tempC>16</tempC>
<tempF>60</tempF>
<windspeedMiles>22</windspeedMiles>
<windspeedKmph>35</windspeedKmph>
</hourly>
</weather>
</data>
My code (below) loops through all the 'weather' nodes, but it only picks out the first 'hourly' child node of each and completely skips the second. Could someone help me? To be honest, I don't know enough about looping to fix it, and it's driving me nuts!
Here is my PHP code. It loads an XML document from a URL and formats the results into div tags, but, as I said, it only reaches the first 'hourly' node of each 'weather' node.
<?php
// load SimpleXML
$data = new SimpleXMLElement('myOnlineXMLdocument.xml', null, true);
echo <<<EOF
<div class="observationRow">
<div class="observationTitleSmall"><br>Time</div>
<div class="observationTitleSmall"><br>Temp C</div>
<div class="observationTitleSmall"><br>Temp F</div>
<div class="observationTitleSmall"><br>Wind Speed MPH</div>
<div class="observationTitleSmall"><br>Wind Speed KMPH</div>
</div>
EOF;
foreach($data as $weather) // loop through our hours
{
echo <<<EOF
<div>
<div class="observationCellSmall"><br>{$weather->time}</div>
<div class="observationCellSmall"><br>{$weather->tempC}</div>
<div class="observationCellSmall"><br>{$weather->tempF}</div>
<div class="observationCellSmall"><br>{$weather->hourly->windspeedMiles}</div>
<div class="observationCellSmall"><br>{$weather->hourly->windspeedKmph}</div>
EOF;
}
echo '</div>';
?>
EDITED CODE:
$str = "";
foreach($data->weather as $weather)
{
foreach ($weather->hourly as $hour)
{
$str .= "
<div>";
if ($hour->time == "0") {
$str .= "
<div class='observationCellSmall'><br>$weather->date</div>
<div class='observationCellSmall'><br>$weather->maxtempC</div>
<div class='observationCellSmall'><br>$weather->mintempC</div>";
}
$str .= "
<div class='observationCellSmall'><br>$hour->time</div>
<div class='observationCellSmall'><br>$hour->tempC</div>
<div class='observationCellSmall'><br>$hour->tempF</div>
<div class='observationCellSmall'><br>$hour->windspeedMiles</div>
<div class='observationCellSmall'><br>$hour->windspeedKmph</div>
</div>
";
}
}
echo $str;
Using a trimmed-down version of your XML feed, that generates this:
<div>
<div class='observationCellSmall'><br>2013-08-19</div>
<div class='observationCellSmall'><br>17</div>
<div class='observationCellSmall'><br>15</div>
<div class='observationCellSmall'><br>0</div>
<div class='observationCellSmall'><br>15</div>
<div class='observationCellSmall'><br>59</div>
<div class='observationCellSmall'><br>11</div>
<div class='observationCellSmall'><br>18</div>
</div>
<div>
<div class='observationCellSmall'><br>300</div>
<div class='observationCellSmall'><br>15</div>
<div class='observationCellSmall'><br>59</div>
<div class='observationCellSmall'><br>13</div>
<div class='observationCellSmall'><br>21</div>
</div>
<div>
<div class='observationCellSmall'><br>2013-08-20</div>
<div class='observationCellSmall'><br>21</div>
<div class='observationCellSmall'><br>16</div>
<div class='observationCellSmall'><br>0</div>
<div class='observationCellSmall'><br>17</div>
<div class='observationCellSmall'><br>62</div>
<div class='observationCellSmall'><br>11</div>
<div class='observationCellSmall'><br>18</div>
</div>
<div>
<div class='observationCellSmall'><br>300</div>
<div class='observationCellSmall'><br>16</div>
<div class='observationCellSmall'><br>61</div>
<div class='observationCellSmall'><br>10</div>
<div class='observationCellSmall'><br>17</div>
</div>
You need a nested loop: one to loop over the weather nodes, and another to loop over the hourly nodes inside each.
foreach($data->weather as $weather) {
foreach($weather->hourly as $hourly) {
// code here
}
}
I don't remember the SimpleXML API 100% off the top of my head; if that doesn't work, you might need to use ->children() to get an iterable list of child nodes.
Either that, or use XPath and grab the hourly nodes directly: /data/weather/hourly.
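Both suggestions can be tried on a cut-down copy of the XML from the question. This is a sketch: the inline `$xml` string keeps only the `time` and `tempC` fields, and the `$nested` and `$viaXpath` arrays are names chosen for illustration:

```php
<?php
$xml = <<<XML
<data>
  <weather>
    <hourly><time>0</time><tempC>17</tempC></hourly>
    <hourly><time>3</time><tempC>16</tempC></hourly>
  </weather>
  <weather>
    <hourly><time>0</time><tempC>17</tempC></hourly>
    <hourly><time>3</time><tempC>16</tempC></hourly>
  </weather>
</data>
XML;

$data = new SimpleXMLElement($xml);

// Option 1: nested loops, outer over <weather>, inner over its <hourly> children.
$nested = [];
foreach ($data->weather as $weather) {
    foreach ($weather->hourly as $hourly) {
        $nested[] = (string) $hourly->time . ':' . (string) $hourly->tempC;
    }
}

// Option 2: XPath selects every <hourly> node in document order.
$viaXpath = [];
foreach ($data->xpath('/data/weather/hourly') as $hourly) {
    $viaXpath[] = (string) $hourly->time . ':' . (string) $hourly->tempC;
}

print_r($nested);
```

The nested loop is the better fit when you also need values from the parent `weather` node; the XPath version is shorter when you only need the hourly data.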

php changing div using DOMDocument but doesn't update page

I'm not sure what I'm doing wrong but I'm getting the right nodeValue for what I want. It's just not updating when the php script is done. Here's the code:
$dom = new DOMDocument();
//suppress HTML5 and other errors
libxml_use_internal_errors(true);
$dom->loadHTMLFile($pageURL);
libxml_use_internal_errors(false);
$xpath = new DOMXPath($dom);
$divContent = $xpath->query("//*[@id='resultStats']/p")->item(0);
$newText = new DOMText("100 results");
var_dump($divContent->nodeValue); //returns old test value "400 results" which is correct
$divContent->removeChild($divContent->firstChild);
$divContent->appendChild($newText);
var_dump($divContent->localName); //"p" because i got it from <p> in resultStats
var_dump($divContent->textContent); //"100 results"
var_dump($divContent->nodeValue); //"100 results"
more of the HTML that is around it
<div class="container">
<div class="row">
<div class="resultStats span3 offset1" id="resultStats">
<p>400 results found.</p>
</div>
</div>
<div class="row">
<div class="span12">
<div class="row">
<div class="span6 offset1">
<?php
if (isset($_POST['q'])) {
//code from above that is executing every time from tests
}
?>
</div>
<div class="span5">
span5
</div>
</div>
</div>
</div>
I'm not sure what I'm doing wrong. If I call $dom->save() it rewrites everything (even the PHP code), so I don't think that's a good idea.
I don't understand why you're using DOMDocument for this. Can't you just do this:
<div class="container">
<div class="row">
<div class="resultStats span3 offset1" id="resultStats">
<?php
// get new result count somehow in $resultCount
echo '<p>'.$resultCount.' results found</p>';
?>
</div>
</div>
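If DOMDocument is still needed for some reason, keep in mind that it works on its own in-memory copy of the markup: changing a node does not touch the HTML that PHP is sending to the browser, so nothing changes on the page until the modified markup is echoed. A minimal sketch, assuming the markup is available as a string (the `$page` and `$updated` variables are made up for illustration):

```php
<?php
// Stand-in for the relevant part of the page.
$page = '<div class="resultStats span3 offset1" id="resultStats"><p>400 results found.</p></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // suppress HTML5 and other parser warnings
$dom->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$p = $xpath->query("//*[@id='resultStats']/p")->item(0);

$updated = '';
if ($p !== null) {
    // Changes only the in-memory tree.
    $p->nodeValue = '100 results found.';
    // Serialize just the surrounding div; this string is what has to be output.
    $updated = $dom->saveHTML($p->parentNode);
}

echo $updated;
```

Serializing only the affected node avoids rewriting the whole file, which was the concern with saving the entire document.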
