Send ID with DOMNodeList - php

I have this splendid code
public static function city($city)
{
$html = file_get_contents("http://www.example.com/?=$city");
libxml_use_internal_errors(true);
$doc = new \DOMDocument();
if($doc->loadHTML($html))
{
$result = new \DOMDocument();
$xpath = new \DOMXPath($doc);
$movies = $xpath->query("/html/body");
$url = $xpath->query("/html/body/a/#");
return $movies;
}
}
This function returns as expected a list om movies, and i fetch them with $movies->nodeValue
But i want to send an ID as well.
So i try to add this snipp (to explode the ID from the URL):
foreach($url as $MovieId){
$data=parse_url($MovieId->nodeValue, PHP_URL_QUERY);
$splittedstring=explode("&",$data);
$id[] = substr(strstr($splittedstring[0], "="),1);
}
And this works perfect when i just echo it like this:
$i=0;
echo '<ul>';
foreach ($movies as $key => $movie) {
echo '<li>'.$movie->nodeValue . '</li>';
$i++;
}
echo '</ul>';
But when trying to send it from the Model, to the view with:
$data = new stdClass;
$data->movie_title = $movies;
$data->movie_id = $id;
return $data;
I get: Notice: Undefined property: DOMNodeList::$nodeValue...
My question: How can i either
convert the $movies to a stdClass
send the id with the object(DOMNodeList) (and retrieve it how?)

Related

Extracting information from <i> tag from HTML using PHP

I am having some code and getting HTTP 500 Error. A bit getting confused. I need to extract from the web of weather cast weather digit information and add in the website.
Here is a code:
orai_class.php
<?php
Class orai{
var $url;
function generate_orai($url){
$html = file_get_contents($url);
$classname = 'wi wi-1';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$results = $xpath->query("//*[#class='" . $classname . "']");
$i=0;
foreach($results as $node)
{
if ($results->length > 0) {
$array[] = $results->item($i)->nodeValue;
}
$i++;
}
return $array;
}
}
?>
index.php
<?php
include("orai.class.php");
$orai = new orai();
print_r($orai->generate_orai('https://orai.15min.lt/prognoze/vilnius'));
?>
Thank You.

While Loop in php not working properly to fetch content using file_get_contents

I am fetching pagination urls list from another website using file_get_contents but the while loop won't work, it fetches data of the first url and thats it won't work on the second url of the array fetched from my database of urls.
include('simple_html_dom.php');
ini_set('max_execution_time', 0);
$con=mysqli_connect("localhost","root","","mydb");
$crawl_query = "SELECT url from list LIMIT 10";
$crawl_url = mysqli_query($con, $crawl_query);
$rows = array();
while($rows = mysqli_fetch_array($crawl_url))
{
$pages = $rows['url'];
$html = file_get_contents($pages);
$dom = new DOMDocument();
#$dom->loadHTML($html);
$finder = new DomXPath($dom);
$links = $finder->query("//*[contains(#class, 'pages')]");
$array1= array();
foreach ($links as $link){
$array1[] = $link;
$length[] = $array1[0]->getElementsByTagName('a');
$final_length = $length[0]->length -1;
for($i=1; $i<=$final_length; $i++ )
{
if($i==1)
{
echo rtrim($pages, ".html").trim($i, "1");
echo "<br/>";
}
else
{
echo rtrim($pages, ".html")."_".trim($i, "1");
echo "<br/>";
}
}
}
}
All I get is
example.com/content/new
example.com/content/new2
example.com/content/new3
which is the result of first url in my database. That url has 3 pages in it's pagination but I can't get the loop to work on second url from mydb.

Why the query doesn't match the DOM?

Here is my code:
$res = file_get_contents("http://www.lenzor.com/photo/search/index/type/user/%D8%B9%D9%84%DB%8C//text/%D9%81%D8%A7%D8%B7%D9%85%D9%87");
$doc = new \DOMDocument();
#$doc->loadHTMLFile($res);
$xpath = new \DOMXpath($doc);
$links = $xpath->query("//ul[#class='user_box']/li");
$result = array();
if (!is_null($links)) {
foreach ($links as $link) {
$href = $link->getAttribute('class');
$result[] = [$href];
}
}
print_r($result);
Here is the content I'm working on. I mean it's the result of echo $res.
Ok well, the result of my code is an empty array. So $links is empty and that foreach won't be executed. Why? Why //ul[#class='user_box']/li query doesn't match the DOM ?
Expected result is an array contains the class attribute of lis.
Try this, Hope this will be helpful. There are few mistakes in your code.
1. You should search like this '//ul[#class="user_box clearfix"]/li' because class="user_box clearfix" class attribute of that HTML source contains two classes.
2. You should use loadHTMLinstead of loadHTMLFile.
<?php
ini_set('display_errors', 1);
libxml_use_internal_errors(true);
$res = file_get_contents("http://www.lenzor.com/photo/search/index/type/user/%D8%B9%D9%84%DB%8C//text/%D9%81%D8%A7%D8%B7%D9%85%D9%87");
$doc = new \DOMDocument();
$doc->loadHTML($res);
$xpath = new \DOMXpath($doc);
$links = $xpath->query('//ul[#class="user_box clearfix"]/li');
$result = array();
if (!is_null($links)) {
foreach ($links as $link) {
$href = $link->getAttribute('class');
$result[] = [$href];
}
}
print_r($result);

Get Element by ClassName with DOMdocument() Method

Here is what I am trying to achieve : retrieve all products on a page and put them into an array. Here is the code I am using :
$page2 = curl_exec($ch);
$doc = new DOMDocument();
#$doc->loadHTML($page2);
$nodes = $doc->getElementsByTagName('title');
$noders = $doc->getElementsByClassName('productImage');
$title = $nodes->item(0)->nodeValue;
$product = $noders->item(0)->imageObject.src;
It works for the $title but not for the product. For info, in the HTML code the img tag looks like this :
<img alt="" class="productImage" data-altimages="" src="xxxx">
I have been looking at this (PHP DOMDocument how to get element?) but I still don't understand how to make it work.
PS : I get this error :
Call to undefined method DOMDocument::getElementsByclassName()
I finally used the following solution :
$classname="blockProduct";
$finder = new DomXPath($doc);
$spaner = $finder->query("//*[contains(#class, '$classname')]");
https://stackoverflow.com/a/31616848/3068233
Linking this answer as it helped me the most with this problem.
function getElementsByClass(&$parentNode, $tagName, $className) {
$nodes=array();
$childNodeList = $parentNode->getElementsByTagName($tagName);
for ($i = 0; $i < $childNodeList->length; $i++) {
$temp = $childNodeList->item($i);
if (stripos($temp->getAttribute('class'), $className) !== false) {
$nodes[]=$temp;
}
}
return $nodes;
}
Theres the code and heres the usage
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTML($html);
$content_node=$dom->getElementById("content_node");
$div_a_class_nodes=getElementsByClass($content_node, 'div', 'a');
function getElementsByClassName($dom, $ClassName, $tagName=null) {
if($tagName){
$Elements = $dom->getElementsByTagName($tagName);
}else {
$Elements = $dom->getElementsByTagName("*");
}
$Matched = array();
for($i=0;$i<$Elements->length;$i++) {
if($Elements->item($i)->attributes->getNamedItem('class')){
if($Elements->item($i)->attributes->getNamedItem('class')->nodeValue == $ClassName) {
$Matched[]=$Elements->item($i);
}
}
}
return $Matched;
}
// usage
$dom = new \DOMDocument('1.0');
#$dom->loadHTML($html);
$elementsByClass = getElementsByClassName($dom, $className, 'h1');

Xpath for extracting links

I create an scraper for an automoto site and first I want to get all manufactures and after that all links of models for each manufactures but with the code below I get only the first model on the list. Why?
<?php
$dom = new DOMDocument();
#$dom->loadHTMLFile('http://www.auto-types.com');
$xpath = new DOMXPath($dom);
$entries = $xpath->query("//li[#class='clearfix_center']/a/#href");
$output = array();
foreach($entries as $e) {
$dom2 = new DOMDocument();
#$dom2->loadHTMLFile('http://www.auto-types.com' . $e->textContent);
$xpath2 = new DOMXPath($dom2);
$data = array();
$data['newLinks'] = trim($xpath2->query("//div[#class='modelImage']/a/#href")->item(0)->textContent);
$output[] = $data;
}
echo '<pre>' . print_r($output, true) . '</pre>';
?>
SO I need to get: mercedes/100, mercedes/200, mercedes/300 but now with my script i get only the first link so mercedes/100...
please help
You need to iterate through the results instead of just taking the first item:
$items = $xpath2->query("//div[#class='modelImage']/a/#href");
$links = array();
foreach($items as $item) {
$links[] = $item->textContent;
}
$data['newLinks'] = implode(', ', $links);

Categories