PHP for each argument passed to function - php

I am passing anywhere between 1 and 10 arguments to a function. I would like the function to process each argument in turn, returning the previous data plus the new data.
So I have a function like follows:
function scrape_google_result_source($link,$link2) //$link is "test" $link2 is "test2"
{
$html = $link;
$cache = $html; //this is my first return
$html = $link2;
$cache = $cache . $html; //this is my first and second return
return $cache; //now I am returning it so it will be "testtest2"
}
This works if I manually pass in $link and $link2 and code the function to handle them. I would like it to run for each argument passed in and do `$cache .= new result`, so that I return the combined result for all the arguments passed in.
Sadly I have no code other than this, as I am not too sure where to start. I did find the func_num_args() PHP function; could that work? Any help greatly appreciated.
Thanks,
Simon

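Try this:
function scrape_google_result_source() // accepts any number of links
{
$cache = '';
$numargs = func_num_args(); // an integer count, so loop by index
for ($n = 0; $n < $numargs; $n++) {
$link = func_get_arg($n);
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $link);
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 100);
$html = curl_exec($ch);
$cache .= $html; // append this link's result
curl_close($ch);
}
return $cache; // combined result for all links
}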
func_get_arg manual
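On PHP 5.6 and later, a variadic parameter (`...$links`) is an alternative to func_num_args()/func_get_arg(). The following is only a minimal sketch: `scrape_all` and the `$fetch` callable are hypothetical names, and `$fetch` stands in for the cURL download so the example runs without network access.

```php
<?php
// Sketch: accept any number of links and concatenate one result per link.
// $fetch is a hypothetical stand-in for the cURL request.
function scrape_all(callable $fetch, string ...$links): string
{
    $cache = '';
    foreach ($links as $link) {
        $cache .= $fetch($link); // previous data plus the new data
    }
    return $cache;
}

// Identity "fetcher" mirrors the question, where each link is returned as-is.
$echoBack = function (string $link): string {
    return $link;
};

$result = scrape_all($echoBack, 'test', 'test2');
echo $result, PHP_EOL; // testtest2
```

In real use the callable would wrap curl_init()/curl_exec(); keeping it separate also makes the loop easy to test.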

Personally I find passing an array and looping through it easier:
function scrape_google_result_source($links)
{
$cache = '';
if( !is_array( $links ) )
{
return 'not array';
}
foreach( $links as $key=>$link )
{
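$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $link); // without this cURL has nothing to fetch
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 100);
$html = curl_exec($ch);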
$cache .= $html;
curl_close($ch);
}
return $cache; //now I am returning it
}
$links_array = array( 'http..','http...');
$html = scrape_google_result_source( $links_array );

Related

How to return class value properly in this example? PHP

I wrote a little crawler and I am wondering how to properly assign the results to the instance being called.
My constructor sets up some basic properties and calls the next method, which contains an if statement that might call a foreach loop. When all is done I echo my results.
This works perfectly fine, but I don't want to echo my json_encode data. I would rather have my $crawler variable at the bottom contain the json_encode data.
This is my code:
<?php
class Crawler {
private $url;
private $class;
private $regex;
private $htmlStack;
private $pageNumber = 1;
private $elementsArray;
public function __construct($url, $class, $regex=null) {
$this->url = $url;
$this->class = $class;
$this->regex = $regex;
$this->curlGet($this->url);
}
private function curlGet($url) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_URL, $url);
$this->htmlStack .= curl_exec($curl);
$response = curl_getinfo($curl, CURLINFO_HTTP_CODE);
$this->paginate($response);
}
private function paginate($response) {
if($response === 200) {
$this->pageNumber++;
$url = $this->url . '?page=' . $this->pageNumber;
$this->curlGet($url);
} else {
$this->CreateDomDocument();
}
}
private function curlGetDeep($link) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_URL, $link);
$product = curl_exec($curl);
$dom = new Domdocument();
@$dom->loadHTML($product);
$xpath = new DomXpath($dom);
$descriptions = $xpath->query('//div[contains(@class, "description")]');
foreach($descriptions as $description) {
return $description->nodeValue;
}
}
private function CreateDomDocument() {
$dom = new Domdocument();
@$dom->loadHTML($this->htmlStack);
$xpath = new DomXpath($dom);
$elements = $xpath->query('//article[contains(@class, "' . $this->class . '")]');
foreach($elements as $element) {
$title = $xpath->query('descendant::div[@class="title"]', $element);
$title = $title->item(0)->nodeValue;
$link = $xpath->query('descendant::a[@class="link-overlay"]', $element);
$link = $link->item(0)->getAttribute('href');
$link = 'https://www.gall.nl' . $link;
$image = $xpath->query('descendant::div[@class="image"]/node()/node()', $element);
$image = $image->item(1)->getAttribute('src');
$description = $this->curlGetDeep($link);
if($this->regex) {
$title = preg_replace($this->regex, '', $title);
}
if(!preg_match('/\dX(\d+)?/', $title)) {
$this->elementsArray[] = [
'title' => $title,
'link' => $link,
'image' => $image,
'description' => $description
];
}
}
echo json_encode(['beers' => $this->elementsArray]);
}
}
$crawler = new Crawler('https://www.gall.nl/shop/speciaal-bier/', 'product-block', '/\d+\,?\d*CL/i');
Github link for some more overview:
https://github.com/stephan-v/crawler/blob/master/ArticleCrawler.php
Hopefully somebody can help me out since I am a bit confused here on how to go about getting this working properly.
You can't do it in the constructor, but you can assign the JSON to a class property and return it from another method. That's the only logical option.
I'm too slow, so I'm just extending ardabeyazoglu's answer with code here:
Change echo json_encode(['beers' => $this->elementsArray]);
into $this->json = json_encode(['beers' => $this->elementsArray]);.
and then
$crawler = new Crawler(....);
var_dump($crawler->json);
You could probably add an accessor method, but a public property works, too.
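A minimal sketch of the accessor variant, assuming nothing about the real Crawler beyond what is shown in the question: `ResultHolder` and `getJson()` are hypothetical names, and the constructor takes a ready-made array instead of scraping so the example runs offline.

```php
<?php
// Sketch of the "store it, then expose it" idea. ResultHolder and getJson()
// are hypothetical names; the real Crawler would fill $elementsArray by scraping.
class ResultHolder
{
    private $elementsArray = [];
    private $json;

    public function __construct(array $elements)
    {
        $this->elementsArray = $elements;
        // Instead of echoing, keep the encoded result on the instance.
        $this->json = json_encode(['beers' => $this->elementsArray]);
    }

    public function getJson(): string
    {
        return $this->json;
    }
}

$holder = new ResultHolder([['title' => 'Example beer']]);
echo $holder->getJson(), PHP_EOL; // {"beers":[{"title":"Example beer"}]}
```

Keeping the property private and reading it through a method means callers cannot overwrite the result by accident.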

Issues with decoding two JSON feed sources and display with PHP/HTML

I am using two JSON feed sources and PHP to display a real estate property slideshow with agents on a website. The code was working prior to the feed provider making changes to where they store property and agent images. I have made the necessary adjustments for the images, but the feed data is not working now. I have contacted the feed providers about the issue, but they say the problem is on my end. No changes beyond the image URLs were made, so I am unsure where the issue may be. I am new to JSON, so I might be missing something. I have included the full script below. Here are the two JSON feed URLs: http://century21.ca/FeaturedDataHandler.c?DataType=4&EntityType=2&EntityID=2119 and http://century21.ca/FeaturedDataHandler.c?DataType=3&AgentID=27830&RotationType=1. The first URL grabs all of the agents and the second grabs a single agent's properties. The AgentID value is sourced from the JSON feed URL dynamically.
class Core
{
private $base_url;
private $property_image_url;
private $agent_id;
private $request_agent_properties_url;
private $request_all_agents_url;
private function formatJSON($json)
{
$from = array('Props:', 'Success:', 'Address:', ',Price:', 'PicTicks:', ',Image:', 'Link:', 'MissingImage:', 'ShowingCount:', 'ShowcaseHD:', 'ListingStatusCode:', 'Bedrooms:', 'Bathrooms:', 'IsSold:', 'ShowSoldPrice:', 'SqFootage:', 'YearBuilt:', 'Style:', 'PriceTypeDesc:');
$to = array('"Props":', '"Success":', '"Address":', ',"Price":', '"PicTicks":', ',"Image":', '"Link":', '"MissingImage":', '"ShowingCount":', '"ShowcaseHD":', '"ListingStatusCode":', '"Bedrooms":', '"Bathrooms":', '"IsSold":', '"ShowSoldPrice":', '"SqFootage":', '"YearBuilt":', '"Style":', '"PriceTypeDesc":' );
return str_ireplace($from, $to, $json); //returns the clean JSON
}
function __construct($agent=false)
{
$this->base_url = 'http://www.century21.ca';
$this->property_image_url = 'http://images.century21.ca';
$this->agent_id = ($agent ? $agent : false);
$this->request_all_agents_url =
$this->base_url.'/FeaturedDataHandler.c?DataType=4&EntityType=3&EntityID=3454';
$this->request_agent_properties_url =
$this->base_url.'/FeaturedDataHandler.c?DataType=3'.'&AgentID='.$this->agent_id.'&RotationType=1';
}
/**
* getSlides()
*/
function getSlides()
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $this->request_all_agents_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$response = curl_exec($ch);
curl_close($ch);
if (empty($response))
return false;
else
$agents = $this->decode_json_string($response);
// Loop Agents And Look For Requested ID
foreach ($agents as $agent)
{
if (($this->agent_id != false) && (isset($agent['WTLUserID'])) && ($agent['WTLUserID'] != $this->agent_id))
{
continue; // You have specified a
}
$properties = $this->getProperties($agent['WTLUserID']);
$this->print_property_details($properties, $agent);
}
}
/**
* getProperties()
*/
function getProperties($agent_id)
{
$url = $this->base_url.'/FeaturedDataHandler.c?DataType=3'.'&AgentID='.$agent_id.'&RotationType=1';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
if (empty($response))
die('No response 2'); //return false;
else
$json = $this->formatJSON($this->decode_json_string($response));
var_dump($json);
die();
// return $json;
}
/**
* print_property_details()
*/
function print_property_details($properties, $agent, $html='')
{
$BASE_URL = $this->base_url;
$PROPERTY_IMAGE_URL = $this->property_image_url;
foreach ($properties as $property)
{
$img = $property['Image'];
// $img = ($property['Image'] ? $property['Image'] : "some url to a dummy image here")
if($property['ListingStatusCode'] != 'SOLD'){
$address = $property['Address'];
$shortaddr = substr($address, 0, -12);
$html .= "<div class='listings'>";
$html .= "<div class='property-image'>";
$html .= "<img src='". $PROPERTY_IMAGE_URL ."' width='449' height='337' alt='' />";
$html .= "</div>";
$html .= "<div class='property-info'>";
$html .= "<span class='property-price'>". $property['Price'] ."</span>";
$html .= "<span class='property-street'>". $shortaddr ."</span>";
$html .= "</div>";
$html .= "<div class='agency'>";
$html .= "<div class='agent'>";
$html .= "<img src='". $agent['PhotoUrl']. "' class='agent-image' width='320' height='240' />";
$html .= "<span class='agent-name'><b>Agent:</b>". $agent['DisplayName'] ."</span>";
$html .= "</div>";
$html .= "</div>";
$html .= "</div>";
}
}
echo $html;
}
function decode_json_string($json)
{
// Strip out junk
$strip = array("{\"Agents\": [","{Props: ",",Success:true}",",\"Success\":true","\r","\n","[{","}]");
$json = str_replace($strip,"",$json);
// Instantiate array
$json_array = array();
foreach (explode("},{",$json) as $row)
{
/// Remove commas and colons between quotes
if (preg_match_all('/"([^\\"]+)"/', $row, $match)) {
foreach ($match as $m)
{
$row = str_replace($m,str_replace(",","|comma|",$m),$row);
$row = str_replace($m,str_replace(":","|colon|",$m),$row);
}
}
// Instantiate / clear array
$array = array();
foreach (explode(',',$row) as $pair)
{
$var = explode(":",$pair);
// Add commas and colons back
$val = str_replace("|colon|",":",$var[1]);
$val = str_replace("|comma|",",",$val);
$val = trim($val,'"');
$val = trim($val);
$key = trim($var[0]);
$key = trim($key,'{');
$key = trim($key,'}');
$array[$key] = $val;
}
// Add to array
$json_array[] = $array;
}
return $json_array;
}
}
Try this code to fix the JSON:
$url = 'http://century21.ca/FeaturedDataHandler.c?DataType=3&AgentID=27830&RotationType=1';
$invalid_json = file_get_contents($url);
$json = preg_replace("/([{,])([a-zA-Z][^: ]+):/", "$1\"$2\":", $invalid_json);
var_dump($json);
All your keys need to be double-quoted
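To illustrate what the replacement does, here is a small sketch that runs the same pattern over a hard-coded string. The sample string is an assumption, shaped like the fragment in the parser error rather than copied from the live feed.

```php
<?php
// Hypothetical sample shaped like the feed's error output: keys are not quoted.
$invalidJson = '{Props: [{Address:"28 Main St",Price:"100000"}],Success:true}';

// Quote any bare key that follows "{" or ",". This is a heuristic: it would
// also rewrite text inside a string value that happens to look like ",word:".
$json = preg_replace('/([{,])([a-zA-Z][^: ]+):/', '$1"$2":', $invalidJson);

$data = json_decode($json, true);
if ($data === null) {
    echo 'Still invalid: ', json_last_error_msg(), PHP_EOL;
} else {
    echo $data['Props'][0]['Address'], PHP_EOL; // 28 Main St
}
```

Checking json_last_error_msg() after decoding is a quick way to see whether the cleaned string is valid before passing it on.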
The JSON at the second URL is not valid, which is why you are not getting results: PHP is unable to decode that feed.
I tried to process it and got this error:
Error: Parse error on line 1:
{Props: [{Address:"28
-^
Expecting 'STRING', '}'
As the error for the second feed shows, all the keys should be wrapped in double quotes, since they are strings rather than constants.
For example, Props should be "Props", and likewise for all the others.
EDIT
You need to update your getProperties() function and add this one (formatJSON($json)) to your class:
// Update this function, just need to update last line of function
function getProperties($agent_id)
{
$url = $this->base_url.'/FeaturedDataHandler.c?DataType=3'.'&AgentID='.$agent_id.'&RotationType=1';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
if (empty($response))
die('No response 2'); //return false;
else
return $this->formatJSON($this->decode_json_string($response)); //this one only need to be updated.
}
//add this function to class. This will format json
private function formatJSON($json){
$from= array('Props:', 'Success:', 'Address:', ',Price:', 'PicTicks:', ',Image:', 'Link:', 'MissingImage:', 'ShowingCount:', 'ShowcaseHD:', 'ListingStatusCode:', 'Bedrooms:', 'Bathrooms:', 'IsSold:', 'ShowSoldPrice:', 'SqFootage:', 'YearBuilt:', 'Style:', 'PriceTypeDesc:');
$to = array('"Props":', '"Success":', '"Address":', ',"Price":', '"PicTicks":', ',"Image":', '"Link":', '"MissingImage":', '"ShowingCount":', '"ShowcaseHD":', '"ListingStatusCode":', '"Bedrooms":', '"Bathrooms":', '"IsSold":', '"ShowSoldPrice":', '"SqFootage":', '"YearBuilt":', '"Style":', '"PriceTypeDesc":' );
return str_ireplace($from, $to, $json); //returns the clean JSON
}
EDIT
I've tested that function and it works fine; maybe there is something wrong with your decode_json_string($json) function.
I took the unclean JSON from the second URL, cleaned it with this function, and pasted the cleaned JSON into a JSON editor to check whether it is valid.

Call a URL with PHP and get XML response [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 7 years ago.
So I am trying to get an XML response after calling a URL with parameters (a GET request). I found the code below, which works.
$url = "http://...";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt($ch, CURLOPT_ENCODING, "gzip,deflate");
$response = curl_exec($ch);
curl_close($ch);
echo $response;
But as response I am getting a huge string with no commas (so I cannot explode it). And this string has only values, no keys.
Is there a way to get an associative array instead?
The XML is like:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<transaction>
<date>2011-02-10T16:13:41.000-03:00</date>
<code>9E884542-81B3-4419-9A75-BCC6FB495EF1</code>
<reference>REF1234</reference>
<type>1</type>
<status>3</status>
<paymentMethod>
<type>1</type>
<code>101</code>
</paymentMethod>
<grossAmount>49900.00</grossAmount>
<discountAmount>0.00</discountAmount>
(...)
So I would like to have an array like:
date => ...
code => ...
reference => ...
(and so on)
Is that possible? If so, how?
EDIT: I don't agree with the "this question is already answered" tag. No code in the linked topic solved my issue. Anyhow, I found a way, with the code below.
$url = "http://...";
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$transaction= curl_exec($curl);
curl_close($curl);
$transaction = simplexml_load_string($transaction);
var_dump($transaction); //retrieve a object(SimpleXMLElement)
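If an associative array is wanted rather than a SimpleXMLElement, one common trick is to round-trip the object through JSON. This is only a sketch on a shortened, hard-coded copy of the XML above; it assumes attributes do not matter, since they end up under an "@attributes" key.

```php
<?php
// Shortened copy of the transaction XML from the question (declaration omitted).
$xmlString = <<<'XML'
<transaction>
  <date>2011-02-10T16:13:41.000-03:00</date>
  <code>9E884542-81B3-4419-9A75-BCC6FB495EF1</code>
  <reference>REF1234</reference>
  <paymentMethod>
    <type>1</type>
    <code>101</code>
  </paymentMethod>
</transaction>
XML;

$transaction = simplexml_load_string($xmlString);
if ($transaction === false) {
    exit('Could not parse XML' . PHP_EOL);
}

// Round-trip through JSON to turn the object tree into nested arrays.
// Caveat: attributes land under "@attributes"; empty elements become empty arrays.
$array = json_decode(json_encode($transaction), true);

echo $array['reference'], PHP_EOL;             // REF1234
echo $array['paymentMethod']['code'], PHP_EOL; // 101
```

All values come back as strings, so amounts such as grossAmount would still need an explicit cast.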
I have had good luck using code like this:
$url = "http://feeds.bbci.co.uk/news/rss.xml";
$xml = file_get_contents($url);
if ($rss = new SimpleXmlElement($xml)) {
echo $rss->channel->title;
}
I use something like this (very universal solution):
http://www.akchauhan.com/convert-xml-to-array-using-dom-extension-in-php5/
Only thing is I exclude the attributes part as I don't need them for my cases
<?php
class xml2array {
function __construct($xml) { // PHP 8 no longer treats a method named after the class as a constructor
if (is_string($xml)) {
$this->dom = new DOMDocument;
$this->dom->loadXml($xml);
}
}
function _process($node) {
$occurance = array();
foreach ($node->childNodes as $child) {
$occurance[$child->nodeName] = ($occurance[$child->nodeName] ?? 0) + 1; // avoid an undefined-key warning
}
if ($node->nodeType == XML_TEXT_NODE) {
$result = html_entity_decode(htmlentities($node->nodeValue, ENT_COMPAT, 'UTF-8'),
ENT_COMPAT,'ISO-8859-15');
} else {
if($node->hasChildNodes()){
$children = $node->childNodes;
for ($i=0; $i < $children->length; $i++) {
$child = $children->item($i);
if ($child->nodeName != '#text') {
if($occurance[$child->nodeName] > 1) {
$result[$child->nodeName][] = $this->_process($child);
} else {
$result[$child->nodeName] = $this->_process($child);
}
} else if ($child->nodeName == '#text') {
$text = $this->_process($child);
if (trim($text) != '') {
$result[$child->nodeName] = $this->_process($child);
}
}
}
}
}
return $result;
}
function getResult() {
return $this->_process($this->dom);
}
}
?>
And call it from your script like this:
$obj = new xml2array($response);
$array = $obj->getResult();
The code is fairly self-explanatory, takes an object-oriented approach, and can easily be modified to exclude or include parts as desired.
It simply loads the XML into a DOM object, then recursively checks for children and fetches their values.
Hope it helps

My first OOP approach in CodeIgniter

I'm trying to make a crawler to fetch templates from sites that don't offer any API access, for later display as an affiliate.
I just started with CI, read the documentation a few times, and below you can find my first OOP approach.
My question is whether I'm on the right path with OOP, or whether there are (I'm sure there are) improvements available for my code. I have read a lot of OOP tutorials on the web, and people seem to have different views of OOP coding.
Thank you in advance.
<?php
class Crawler extends CI_Model {
function __construct(){
parent::__construct();
}
function get_with_curl($url) {
if(ini_get('allow_url_fopen')) { // file_get_html() needs allow_url_fopen to read a URL; otherwise fall back to cURL
return $this->html = file_get_html($url);
} else {
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 5.01; Windows NT 5.0)');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
$str = curl_exec($curl);
curl_close($curl);
return $this->html = str_get_html($str);
}
}
function array_arrange($links){
$links = array_merge(array_unique($links));
foreach (range(1, count($links), 2) as $k) {
unset($links[$k]);
}
return array_merge($links);
}
function diff($source,$links){
$this->db->like('source', $source);
$this->db->from('themes');
$total = $this->db->count_all_results();
if($total >= count($links)){
return false;
} else {
$diff = count($links)-$total;
$data = array_slice($links,-$diff,$diff,true);
return $data;
}
}
function get_links($url,$find){
$this->html = $this->get_with_curl($url);
foreach($this->html->find($find) as $v){
$data[] = $v->href;
}
$this->html->clear();
unset($this->html);
return $data;
}
function themefyme(){
$links = $this->get_links('http://themify.me/themes','ul[class=theme-list] li a');
$links = $this->array_arrange($links);
$links = $this->diff('themefyme',$links);
if($links){
$i = 0;
foreach($links as $link){
$this->html = $this->get_with_curl($link);
$data[$i]['source'] = 'themefyme';
$data[$i]['name'] = strtok($this->html->find('h1', 0)->plaintext,' ');
$data[$i]['link'] = $link;
$data[$i]['demo'] = 'http://themify.me/demo/#theme='.strtolower($data[$i]['name']);
$data[$i]['price'] = filter_var($this->html->find('h1 sup', 0)->plaintext, FILTER_SANITIZE_NUMBER_INT);
$data[$i]['description'] = $this->html->find('big', 0)->plaintext;
$data[$i]['features'] = $this->html->find('ul', 0)->plaintext;
$data[$i]['img_large'] = $this->html->find('.theme-large-screen img', 0)->src;
$data[$i]['img_thumb'] = 'http://themify.me/wp-content/themes/themify/thumb.php?src='.$data[$i]['img_large'].'&q=90&w=220';
$i++;
$this->html->clear();
unset($this->html);
}
$this->db->insert_batch('themes', $data);
return $data;
}
return false;
}
function themefuse(){
$links = $this->get_links('http://www.themefuse.com/wp-themes-shop/','.theme-img a');
$links = $this->array_arrange($links);
$links = $this->diff('themefuse',$links);
if($links){
$i = 0;
foreach($links as $link){
$this->html = $this->get_with_curl($link);
$data[$i]['source'] = 'themefuse';
$data[$i]['name'] = $this->html->find('.theme-price', 0)->plaintext;
$data[$i]['link'] = $link;
$data[$i]['demo'] = 'http://themefuse.com/demo/wp/'.strtolower($data[$i]['name']).'/';
$data[$i]['description'] = $this->html->find('.short-descr', 0)->plaintext;
$data[$i]['highlights'] = $this->html->find('.highlights', 0)->outertext;
$data[$i]['features'] = $this->html->find('.col-features', 0)->outertext;
$data[$i]['theme_info'] = $this->html->find('.col-themeinfo', 0)->outertext;
preg_match("/src=(.*?)&/",$this->html->find('.slideshow img', 0)->src, $img);
$data[$i]['img_large'] = $img[1];
$data[$i]['img_thumb'] = 'http://themefuse.com/wp-content/themes/themefuse/thumb.php?src='.$img[1].'&h=225&w=431&zc=1&q=100';
$i++;
$this->html->clear();
unset($this->html);
}
$this->db->insert_batch('themes', $data);
return $data;
}
return false;
}
}
As PeeHaa says, your example isn't really OOP. OOP means object-oriented programming, which basically means that your classes (objects) should represent an entity as if it were a physical object. So your class would be a group of functions (methods) that relate to that object.
For example, a Robot object might have functions like moveForward, moveBackward, speak, etc.
And you might have another Robot type that can do everything the Robot object can do, but in a slightly different way. For example, you might have a MoonRobot object that extends the Robot object (inheriting all of Robot's functions), but its moveForward function might be different, so it can be overridden in the MoonRobot class.
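The Robot and MoonRobot idea can be sketched in PHP roughly as follows. The class names come from the explanation above; the method bodies and the strings they return are made up purely for illustration.

```php
<?php
// Illustrative only: Robot and MoonRobot come from the explanation above;
// the method bodies and return strings are made up for the sketch.
class Robot
{
    public function moveForward(): string
    {
        return 'rolling forward on wheels';
    }

    public function speak(): string
    {
        return 'beep';
    }
}

class MoonRobot extends Robot
{
    // Only the behaviour that differs is overridden; speak() is inherited.
    public function moveForward(): string
    {
        return 'hopping forward in low gravity';
    }
}

$robots = [new Robot(), new MoonRobot()];
foreach ($robots as $robot) {
    echo get_class($robot), ': ', $robot->moveForward(), ', ', $robot->speak(), PHP_EOL;
}
```

Applied to the crawler, the shared fetching and saving logic could live in a base class, with one subclass per site overriding only the parsing step.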

PHP Data Extraction From External Website, Then Write to Database [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 9 years ago.
Just wondering how this would be done. Let's say there's a simple HTML table on an external website, and you have a database with the same structure as that HTML table. I understand that you can use file_get_contents to grab that entire web page.
From there, I would assume that you would remove everything from your file_get_contents except for the stuff between the <table></table> tags, thus isolating the table containing the data you wish to write.
What is the next step? Assuming your database table structure matches the structure of the HTML table, what would be the easiest way to write the table data into your database?
Perhaps this will be of interest (hope so, lol): a super simple class to parse HTML,
using only DOMDocument and cURL.
<?php
$scraper = new DOMScraper();
//example only; I couldn't think of a site with an example table
$scraper->setSite('http://cherone.co.uk/forum')->setSource();
//all tables on page
echo '<table>'.$scraper->getInnerHTML('table').'</table>';
//get only tables with id="some_table_id" or any attribute match, e.g. class="something"
echo '<table>'.$scraper->getInnerHTML('table','id=some_table_id').'</table>';
//get all tables contents but return only nodeValue/text
echo '<table>'.$scraper->getInnerHTML('table','id=some_table_id',true).'</table>';
/**
* Generic DOM scraper using DOMDocument and cURL
*/
Class DOMScraper extends DOMDocument{
public $site;
private $source;
private $dom;
function __construct(){
libxml_use_internal_errors(true);
$this->preserveWhiteSpace = false;
$this->strictErrorChecking = false;
}
function setSite($site){
$this->site = $site;
return $this;
}
function setSource(){
if(empty($this->site))return 'Error: Missing $this->site, use setSite() first';
$this->source = $this->get_data($this->site);
return $this;
}
function getInnerHTML($tag, $id=null, $nodeValue = false){
if(empty($this->source))return 'Error: Missing $this->source, use setSource() first';
$this->loadHTML($this->source);
$tmp = $this->getElementsByTagName($tag);
$ret = null;
foreach ($tmp as $v){
if($id !== null){
$attr = explode('=',$id);
if($v->getAttribute($attr[0])==$attr[1]){
if($nodeValue == true){
$ret .= trim($v->nodeValue);
}else{
$ret .= $this->innerHTML($v);
}
}
}else{
if($nodeValue == true){
$ret .= trim($v->nodeValue);
}else{
$ret .= $this->innerHTML($v);
}
}
}
return $ret;
}
function innerHTML($dom){
$ret = "";
$nodes = $dom->childNodes;
foreach($nodes as $v){
$tmp = new DOMDocument();
$tmp->appendChild($tmp->importNode($v, true));
$ret .= trim($tmp->saveHTML());
}
return $ret;
}
function get_data($url){
if(function_exists('curl_init')){
$ch = curl_init();
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}else{
return file_get_contents($url);
}
}
}
?>
You can use PHP Simple HTML DOM Parser for example
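For the "what is the next step" part, a rough sketch using only DOMDocument: turn each table row into an array of cell values, then hand the rows to a prepared statement. `table_rows`, `insert_rows`, the `products` table and its columns are all hypothetical names, and the HTML is hard-coded so the parsing part runs offline.

```php
<?php
// Turn every <tr> that contains <td> cells into a plain array of cell texts.
function table_rows(string $html): array
{
    libxml_use_internal_errors(true); // real-world HTML is rarely clean
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $rows = [];
    foreach ($dom->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        if ($cells) { // header rows made of <th> are skipped
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Hypothetical table and column names, for illustration only.
function insert_rows(PDO $pdo, array $rows): void
{
    $stmt = $pdo->prepare('INSERT INTO products (name, price) VALUES (?, ?)');
    foreach ($rows as $row) {
        $stmt->execute($row);
    }
}

$html = '<table><tr><th>name</th><th>price</th></tr>'
      . '<tr><td>Foo</td><td>10</td></tr>'
      . '<tr><td>Bar</td><td>20</td></tr></table>';

print_r(table_rows($html));
```

insert_rows() is defined but not called here, since it needs a real PDO connection; the prepared statement keeps scraped values from being interpreted as SQL.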
