XPath : Parsing a page - php

Let's say I have this HTML:
<div class="area">Area One</div>
<div class="key">AAA</div>
<div class="value">BBB</div>
<div class="key">CCC</div>
<div class="value">DDD</div>
<div class="key">EEE</div>
<div class="value">FFF</div>
<div class="area">Area Two</div>
I want to use XPath to make an array:
my_array['area']
[0] =>
['AAA'] => "BBB"
['CCC'] => "DDD"
['EEE'] => "FFF"
[1] => ...
And so on. Any thoughts on how this can be accomplished? What I'm trying to do is use "area" as the marker between sub-arrays.

My PHP knowledge is somewhat limited, but you can try:
<?php
$html = <<<'HTML'
<div class="area">Area One</div>
<div class="key">AAA</div>
<div class="value">BBB</div>
<div class="key">CCC</div>
<div class="value">DDD</div>
<div class="key">EEE</div>
<div class="value">FFF</div>
<div class="area">Area Two</div>
<div class="key">GGG</div>
<div class="value">HHH</div>
<div class="key">III</div>
<div class="value">JJJ</div>
<div class="key">KKK</div>
<div class="value">LLL</div>
HTML;
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXpath($document);
$nbarea = count($xpath->query('//*[contains(text(),"Area")]'));
$i=1;
$j=1;
for ($a = 1; $a <= $nbarea; $a++) {
for ($b = 1; $b <= 3; $b++) {
$element1 = $xpath->query('//*[contains(text(),"Area")]['.$i.']/following::div['.$j.']');
$j++;
$element2 = $xpath->query('//*[contains(text(),"Area")]['.$i.']/following::div['.$j.']');
$h1 = $element1->item(0)->nodeValue;
$h2 = $element2->item(0)->nodeValue;
$area[$i-1][$h1] = $h2;
$j++;
}
$i++;
$j=1;
}
print_r($area)
?>
Output :
Array
(
[0] => Array
(
[AAA] => BBB
[CCC] => DDD
[EEE] => FFF
)
[1] => Array
(
[GGG] => HHH
[III] => JJJ
[KKK] => LLL
)
)
Side note : I've assumed you always have the same number of elements for each area (=3).
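If the number of key/value pairs per area varies, one option is to walk the divs in document order and start a new sub-array whenever an "area" div appears. This is only a minimal sketch: it assumes the class names from the question (area, key, value) and that each key is directly followed by its value. The helper name parseAreas is made up for illustration.

```php
<?php
// Hypothetical helper: groups key/value divs under the preceding "area" div.
function parseAreas(string $html): array
{
    $document = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate fragment-style HTML
    $document->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($document);
    $areas = [];
    $index = -1;          // index of the current area, -1 until the first one
    $pendingKey = null;   // key waiting for its value

    // Select only the divs we care about, in document order.
    $nodes = $xpath->query('//div[@class="area" or @class="key" or @class="value"]');
    foreach ($nodes as $div) {
        $class = $div->getAttribute('class');
        $text = trim($div->textContent);
        if ($class === 'area') {
            $areas[++$index] = [];      // a new area starts a new sub-array
            $pendingKey = null;
        } elseif ($class === 'key') {
            $pendingKey = $text;
        } elseif ($pendingKey !== null && $index >= 0) {
            $areas[$index][$pendingKey] = $text;
            $pendingKey = null;
        }
    }
    return $areas;
}

$html = '<div class="area">Area One</div>'
    . '<div class="key">AAA</div><div class="value">BBB</div>'
    . '<div class="area">Area Two</div>'
    . '<div class="key">GGG</div><div class="value">HHH</div>'
    . '<div class="key">III</div><div class="value">JJJ</div>';

print_r(parseAreas($html));
```

Here the first area gets one pair and the second gets two, so the fixed count of 3 is no longer needed.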

Related

php xquery parsing html

How can I parse nested HTML tags like this structure:
<article class="tile">
<div class="tile-content">
ignore
<div class="tile-content__text tile-content__text--arrow-white">
<label class="label-date label-date--blue">01.12.2021</label>
<h4><a class="link-color-black" href="link-1">title-1</a></h4>
<p class="tile-content__paragraph tile-content__paragraph--gray pd-ver-10">
content-1
</p>
</div>
more
</div>
</article>
<article class="tile">
<div class="tile-content">
ignore
<div class="tile-content__text tile-content__text--arrow-white">
<label class="label-date label-date--blue">02.12.2021</label>
<h4><a class="link-color-black" href="link-2">title-2</a></h4>
<p class="tile-content__paragraph tile-content__paragraph--gray pd-ver-10">
content-2
</p>
</div>
more
</div>
</article>
into an array like:
$parsedArray = [
0 =>
['title' => 'title-1',
'link' => 'link-1',
'date' => '2021-12-01',
'content' => 'content-1'],
1 =>
['title' => 'title-2',
'link' => 'link-2',
'date' => '2021-12-02',
'content' => 'content-2'],
....]
I use the XPath query below, but it removes all the tags, and afterwards I only have the concatenated text of all the tags. I need to extract the info from each tag separately. Any tips?
$dom = new DOMDocument();
$dom->loadHTML($html['html']);
$xpath = new DOMXPath($dom);
$nodelist = $xpath->query("//article[contains(@class, 'tile')]");
foreach ($nodelist as $n) {
echo '<pre>';
var_dump($n);
echo '</pre>';
}
var_dump won't parse the DOM :)
You just need to re-query for your elements within the tile, then assign them to the array.
Assign a working item array to define the structure if it matters, else just build up the result as you go.
<?php
$str = '<article class="tile">
<div class="tile-content">
ignore
<div class="tile-content__text tile-content__text--arrow-white">
<label class="label-date label-date--blue">02.12.2021</label>
<h4><a class="link-color-black" href="link-2">title-2</a></h4>
<p class="tile-content__paragraph tile-content__paragraph--gray pd-ver-10">
content-2
</p>
</div>
more
</div>
</article>';
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHtml($str);
libxml_clear_errors();
$xpath = new DOMXPath($dom);
$result = [];
foreach ($xpath->query("//article[contains(@class, 'tile')]") as $tile) {
// define item structure
$item = [
'title' => '',
'link' => '',
'date' => '',
'content' => ''
];
// find date (the leading "." keeps the query relative to $tile)
$query = $xpath->query(".//label[contains(@class, 'label-date')][1]", $tile);
if (count($query)) {
$item['date'] = $query[0]->nodeValue;
}
// find link/title (relative to $tile, not the whole document)
$query = $xpath->query(".//h4/a[1]", $tile);
if (count($query)) {
$item['link'] = $query[0]->getAttribute('href');
$item['title'] = $query[0]->nodeValue;
}
// find content (relative to $tile, not the whole document)
$query = $xpath->query(".//p[contains(@class, 'tile-content__paragraph')][1]", $tile);
if (count($query)) {
$item['content'] = $query[0]->nodeValue;
}
// assign
$result[] = $item;
// cleanup
unset($item, $query);
}
print_r($result);
Output:
Array
(
[0] => Array
(
[title] => title-2
[link] => link-2
[date] => 02.12.2021
[content] =>
content-2
)
)
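The question asked for dates such as 2021-12-02, while the output above still shows 02.12.2021 and keeps the whitespace around the content. One possible post-processing step is sketched below. It assumes the dates always arrive in d.m.Y form, and normalizeItem is a hypothetical helper name, not part of the answer above.

```php
<?php
// Hypothetical helper: trims the text fields and converts "d.m.Y" dates to "Y-m-d".
function normalizeItem(array $item): array
{
    // array_map keeps the string keys when given a single array.
    $item = array_map('trim', $item);

    $date = DateTime::createFromFormat('d.m.Y', $item['date']);
    if ($date !== false) {
        // Only overwrite the date when it parsed successfully.
        $item['date'] = $date->format('Y-m-d');
    }
    return $item;
}

$item = [
    'title' => 'title-2',
    'link' => 'link-2',
    'date' => '02.12.2021',
    'content' => "\ncontent-2\n",
];
print_r(normalizeItem($item));
```

Calling it as `$result[] = normalizeItem($item);` inside the loop would be one way to wire it in.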

Foreach not showing all items in multidimensional array

This is my first question in a long time, any help is greatly appreciated!
I've got one database storing vehicles and one database storing their images. I am using an INNER JOIN to grab a list of vehicles and their images. After the database query, I put the return into an array; so 2 arrays in 1 array:
return array($vehicles, $vehicle_images);
When I do a print_r I get the correct return:
<?php print_r($your_listings[0]); ?>
<br />
<?php print_r($your_listings[1]); ?>
Returns:
Array
(
[0] => Array
(
[vehicle_id] => 35
[vehicle_type] => jeep
[vehicle_vin] => 6969
[owner_email] => user@user.com
[vehicle_make] => Jeep
[vehicle_year] => 2008
[vehicle_model] => cherokee
)
[1] => Array
(
[vehicle_id] => 36
[vehicle_type] => motorcycle
[vehicle_vin] => 1234
[owner_email] => user@user.com
[vehicle_make] => honda
[vehicle_year] => 2018
[vehicle_model] => random
)
[2] => Array
(
[vehicle_id] => 39
[vehicle_type] => atv
[vehicle_vin] => 3215
[owner_email] => user@user.com
[vehicle_make] => Yamaha
[vehicle_year] => 1990
[vehicle_model] => OHYEA
)
)
Array
(
[0] => Array
(
[vehicle_id] => 35
[image_display] => placeholder
)
[1] => Array
(
[vehicle_id] => 36
[image_display] => /new/images/vehicles/users/42/image.jpg
)
[2] => Array
(
[vehicle_id] => 36
[image_display] => /new/images/vehicles/users/42/vehicle1.jpg
)
[3] => Array
(
[vehicle_id] => 35
[image_display] => /new/images/vehicles/users/42/vehicle.jpg
)
[4] => Array
(
[vehicle_id] => 39
[image_display] => placeholder
)
)
Now when I do a foreach (including bootstrap 4 styling), it only shows 2 vehicles instead of 3; the 2 vehicles it shows appear to be showing exactly as I want them:
<div class="container-fluid">
<div class="row no-gutters">
<?php
$your_listings = owner_listings($_SESSION['user']);
if (!($your_listings[0])) {
echo '<div class="col-sm"><div class="alert alert-danger" role="alert"><i class="fas fa-exclamation"></i> You do not have any listings active at this time.</div></div>';
}
else {
foreach ($your_listings as $i => $item) {
$make = $your_listings[0][$i]['vehicle_make'];
$model = $your_listings[0][$i]['vehicle_model'];
$year = $your_listings[0][$i]['vehicle_year'];
$vehicle = $your_listings[0][$i]['vehicle_id'];
$image = $your_listings[1][$i]['image_display'];
if ($image != 'placeholder') {
echo '<div class="col-sm"><div class="card" style="width: 18rem;">
<h5 class="card-title text-center font-weight-bold">'.$year.' '.$make.' '.$model.'</h5>
<img class="card-img-top" src="'.$image.'" alt="'.$year.' '.$make.' '.$model.'">
<div class="card-body">
Edit
</div>
</div></div>';
}
else {
if ($your_listings[0][$i]['vehicle_type'] == 'atv') {
$image = '/new/images/vehicles/types/atv.png';
}
elseif ($your_listings[0][$i]['vehicle_type'] == 'jeep') {
$image = '/new/images/vehicles/types/jeep.png';
}
elseif ($your_listings[0][$i]['vehicle_type'] == 'motorcycle') {
$image = '/new/images/vehicles/types/motorchycle.png';
}
echo '<div class="col-sm"><div class="card" style="width: 18rem;">
<h5 class="card-title text-center font-weight-bold">'.$year.' '.$make.' '.$model.'</h5>
<img class="card-img-top" src="'.$image.'" alt="'.$year.' '.$make.' '.$model.'">
<div class="card-body">
Edit
</div>
</div></div>';
}
}
}
?>
</div>
</div>
Have I just been staring at this too long? What am I doing wrong? Any help is appreciated.
Thanks!
You are looping over the outer array, which, as you said, holds just two arrays, so the loop body runs only twice. Loop over the first element of $your_listings instead to get all three vehicles:
if (!($your_listings[0])) {
echo '<div class="col-sm"><div class="alert alert-danger" role="alert"><i class="fas fa-exclamation"></i> You do not have any listings active at this time.</div></div>';
}
else {
foreach ($your_listings[0] as $i => $item) {
$make = $item['vehicle_make'];
$model = $item['vehicle_model'];
Give this a try:
<div class="container-fluid">
<div class="row no-gutters">
<?php
$your_listings = owner_listings($_SESSION['user']);
if (!($your_listings[0]))
{
echo '<div class="col-sm"><div class="alert alert-danger" role="alert"><i class="fas fa-exclamation"></i> You do not have any listings active at this time.</div></div>';
}
else
{
$newarray = array();
// the image rows live in $your_listings[1]; index them by vehicle_id
foreach($your_listings[1] as $i => $item)
{
$newarray[$item["vehicle_id"]] = $item["image_display"];
}
foreach ($your_listings[0] as $i => $item)
{
$make = $item['vehicle_make'];
$model = $item['vehicle_model'];
$year = $item['vehicle_year'];
$vehicle = $item['vehicle_id'];
$image = $newarray[$vehicle];
if ($image != 'placeholder')
{
echo '<div class="col-sm"><div class="card" style="width: 18rem;">
<h5 class="card-title text-center font-weight-bold">'.$year.' '.$make.' '.$model.'</h5>
<img class="card-img-top" src="'.$image.'" alt="'.$year.' '.$make.' '.$model.'">
<div class="card-body">
Edit
</div>
</div></div>';
}
else
{
if ($item['vehicle_type'] == 'atv') {
$image = '/new/images/vehicles/types/atv.png';
}
elseif ($item['vehicle_type'] == 'jeep') {
$image = '/new/images/vehicles/types/jeep.png';
}
elseif ($item['vehicle_type'] == 'motorcycle') {
$image = '/new/images/vehicles/types/motorchycle.png';
}
echo '<div class="col-sm"><div class="card" style="width: 18rem;">
<h5 class="card-title text-center font-weight-bold">'.$year.' '.$make.' '.$model.'</h5>
<img class="card-img-top" src="'.$image.'" alt="'.$year.' '.$make.' '.$model.'">
<div class="card-body">
Edit
</div>
</div></div>';
}
}
}
?>
</div>
</div>
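Because a vehicle can have several image rows, it may help to build the lookup once and prefer a real image over the placeholder. The sketch below assumes the image array has the shape printed in the question. firstImagePerVehicle is a hypothetical name.

```php
<?php
// Hypothetical helper: maps vehicle_id => first non-placeholder image, else 'placeholder'.
function firstImagePerVehicle(array $images): array
{
    $lookup = [];
    foreach ($images as $row) {
        $id = $row['vehicle_id'];
        $current = $lookup[$id] ?? 'placeholder';
        // Once a real image is stored, later rows for the same vehicle are ignored.
        if ($current === 'placeholder') {
            $lookup[$id] = $row['image_display'];
        }
    }
    return $lookup;
}

$images = [
    ['vehicle_id' => 35, 'image_display' => 'placeholder'],
    ['vehicle_id' => 36, 'image_display' => '/new/images/vehicles/users/42/image.jpg'],
    ['vehicle_id' => 36, 'image_display' => '/new/images/vehicles/users/42/vehicle1.jpg'],
    ['vehicle_id' => 35, 'image_display' => '/new/images/vehicles/users/42/vehicle.jpg'],
    ['vehicle_id' => 39, 'image_display' => 'placeholder'],
];
print_r(firstImagePerVehicle($images));
```

In the rendering loop, `$image = $lookup[$item['vehicle_id']] ?? 'placeholder';` would then pick one image per vehicle regardless of row order.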

Parsing html using php to an array

I have the below html
<p>text1</p>
<ul>
<li>list-a1</li>
<li>list-a2</li>
<li>list-a3</li>
</ul>
<p>text2</p>
<ul>
<li>list-b1</li>
<li>list-b2</li>
<li>list-b3</li>
</ul>
<p>text3</p>
Does anyone have an idea how to parse this HTML with PHP to get the output below as a nested array?
The first level is for the "p" tags
and the second is for the "ul" tags, because every "p" tag above is followed by a "ul" tag.
Array
(
[0] => Array
(
[value] => text1
(
[il] => list-a1
[il] => list-a2
[il] => list-a3
)
)
[1] => Array
(
[value] => text2
(
[il] => list-b1
[il] => list-b2
[il] => list-b3
)
)
)
I can't use replace or strip all the tags, because I use:
foreach ($doc->getElementsByTagName('p') as $link)
{
$dont = $link->textContent;
if (strpos($dont, 'document.') === false) {
$links2[] = array(
'value' => $link->textContent, );
}
$er=0;
foreach ($doc->getElementsByTagName('ul') as $link)
{
$dont2 = $link->nodeValue;
//echo $dont2;
if (strpos($dont2, 'favorisContribuer') === false) {
$links3[]= array(
'il' => $link->nodeValue, );
}
You could use the DOMDocument class (http://php.net/manual/en/class.domdocument.php)
You can see an example below.
<?php
$html = '
<p>text1</p>
<ul>
<li>list-a1</li>
<li>list-a2</li>
<li>list-a3</li>
</ul>
<p>text2</p>
<ul>
<li>list-b1</li>
<li>list-b2</li>
<li>list-b3</li>
</ul>
<p>text3</p>
';
$doc = new DOMDocument();
$doc->loadHTML($html);
$textContent = $doc->textContent;
$textContent = trim(preg_replace('/\t+/', '<br>', $textContent));
echo '
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
' . $textContent . '
</body>
</html>
';
?>
However, I would suggest using JavaScript to find the content and send it to PHP instead.
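The example above prints the text content rather than building the nested array from the question. One way to build it with DOMXPath is sketched here. It assumes each list belongs to the "p" directly before it, and it uses an "items" key because PHP arrays cannot repeat the same key as the desired output does.

```php
<?php
$html = '<p>text1</p><ul><li>list-a1</li><li>list-a2</li><li>list-a3</li></ul>'
    . '<p>text2</p><ul><li>list-b1</li><li>list-b2</li><li>list-b3</li></ul>'
    . '<p>text3</p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$result = [];
foreach ($xpath->query('//p') as $p) {
    // Take the element right after this <p>, but only if it is a <ul>.
    $items = [];
    foreach ($xpath->query('following-sibling::*[1][self::ul]/li', $p) as $li) {
        $items[] = trim($li->textContent);
    }
    // Paragraphs without a list (text3 here) are skipped, as in the desired output.
    if ($items !== []) {
        $result[] = ['value' => trim($p->textContent), 'items' => $items];
    }
}
print_r($result);
```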

Echo data from an array with PHP

I have the following array in php:
$stats = array(
"Item 1" => 20,
"Item 2" => 30,
"Item 3" => 66,
"Item 4" => 1
);
I need to echo these values, so I try this:
<?
foreach ($stats as $stat => $data) {
echo '
<div class="col-sm-6">
<div class="widget">
<div class="widget-body p-t-lg">
<div class="clearfix m-b-md">
<h1 class="pull-left text-primary m-0 fw-500"><span class="counter" data-plugin="counterUp">'.$data.'</span></h1>
<div class="pull-right watermark"><i class="fa fa-2x fa-tv"></i></div>
</div>
<p class="m-b-0 text-muted">'.$stats[$stat].'</p>
</div>
</div>
</div>
';
}
?>
But I have only the numerical values echoed.
Do you have the solution ?
Thanks.
Since you are using the foreach construct, $stat holds the keys and $data holds the values. So echo $stats[$stat] prints the value stored under the key $stat, which is the same as $data. If you want to echo the keys, do this: echo $stat.
You are trying to print the key of the array. array_search returns the key for a given value:
$array = array(0 => 'blue', 1 => 'red', 2 => 'green', 3 => 'red');
$key = array_search('green', $array); // $key = 2;
echo $key;
Use this code to print the key of the array
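Applied to the array from the question, the foreach form can be sketched as follows. The variable names $label and $count are illustrative, and the markup is reduced to plain text to keep the example short.

```php
<?php
$stats = [
    'Item 1' => 20,
    'Item 2' => 30,
    'Item 3' => 66,
    'Item 4' => 1,
];

$lines = [];
foreach ($stats as $label => $count) {
    // $label is the key ("Item 1"), $count is the value (20).
    $lines[] = htmlspecialchars($label) . ': ' . $count;
}
echo implode("\n", $lines), "\n";
```

In the original template, that means printing $stat in the paragraph and $data in the counter span.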

How do I merge array entries with the same key without showing duplicates

I want to merge the array entries that share the same group into one. Example:
$options = array(
array("group" => "header","title" => "Content 1"),
array("group" => "header","title" => "Content 2"),
array("group" => "menu","title" => "Content 3"),
array("group" => "content","title" => "Content 4"),
array("group" => "content","title" => "Content 5"),
array("group" => "content","title" => "Content 6"),
array("group" => "footer","title" => "Content 7")
);
foreach ($options as $value) {
if ($value['group']) {
echo "<div class='{$value['group']}'>";
echo $value['title'];
echo "</div>";
}
}
Current output is
<div class='header'>Content 1</div><div class='header'>Content 2</div><div class='menu'>Content 3</div><div class='content'>Content 4</div><div class='content'>Content 5</div><div class='content'>Content 6</div><div class='footer'>Content 7</div>
What I want here is to be
<div class='header'>
Content 1
Content 2
</div>
<div class='menu'>
Content 3
</div>
<div class='content'>
Content 4
Content 5
Content 6
</div>
<div class='footer'>
Content 7
</div>
Let me know
$grouped = array();
foreach($options as $option) {
list($group, $title) = array_values($option);
if (!isset($grouped[$group])) {
$grouped[$group] = array();
}
$grouped[$group][] = $title;
}
foreach ($grouped as $group => $titles) {
echo sprintf('<div class="%s">%s</div>', $group, implode('', $titles));
}
$groups = array ();
foreach ( $options as $value ) {
// create the entry for this group the first time it is seen
if ( !isset ( $groups[$value['group']] ) ) {
$groups[$value['group']] = array ( 'group' => $value['group'], 'title' => array () );
}
$groups[$value['group']]['title'][] = $value['title'];
}
foreach ( $groups as $group ) {
echo "<div class=\"{$group['group']}\">";
echo implode ( "\n", $group['title'] );
echo "</div>";
}
This should work, but if it doesn't matter to you, you could also just change the structure of your hardcoded-array, then you wouldn't need my first foreach.
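For illustration, the restructured array could look like the sketch below, with the group name as key and the titles as a list. This layout is an assumption about what "changing the structure" means, not something the answer spells out.

```php
<?php
// Assumed restructuring: group name => list of titles.
$options = [
    'header' => ['Content 1', 'Content 2'],
    'menu' => ['Content 3'],
    'content' => ['Content 4', 'Content 5', 'Content 6'],
    'footer' => ['Content 7'],
];

$html = '';
foreach ($options as $group => $titles) {
    // One div per group, with the titles on separate lines.
    $html .= "<div class='{$group}'>\n" . implode("\n", $titles) . "\n</div>\n";
}
echo $html;
```

With this shape a single loop is enough, since the grouping is already encoded in the array keys.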
