i am reading the html source code of instagram post by using the CURL. I am able to do this on localhost but when i test the code on live domain then meta tags with og property like og:type is missing, it only showing at localhost.
This is the complete code.
<?php
function get_domain($url)
{
$pieces = parse_url($url);
$domain = isset($pieces['host']) ? $pieces['host'] : $pieces['path'];
if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i',
$domain, $regs)) {
return $regs['domain'];
}
return false;
}
//run curl here and get html code of instagram post page
function file_get_contents_curl($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
//check instagram url
function checkinstaurl($urlhere) {
//remove white space
$urlhere = trim($urlhere);
$urlhere = htmlspecialchars($urlhere);
///remove white space
if (get_domain($urlhere) == "instagram.com") {
//getting the meta tag data
$html = file_get_contents_curl($urlhere);
//parsing begins here:
$doc = new DOMDocument();
#$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');
//get and display what you need:
$title = $nodes->item(0)->nodeValue;
$metas = $doc->getElementsByTagName('meta');
$mediatype = null;
$description = null;
for ($i = 0; $i < $metas->length; $i++)
{
$meta = $metas->item($i);
if($meta->getAttribute('property') == 'og:type')
$mediatype = $meta -> getAttribute('content');
if($mediatype == 'video') {
if($meta->getAttribute('property') == 'og:video')
$description = $meta -> getAttribute('content');
} else {
if($meta->getAttribute('property') == 'og:image')
$description = $meta -> getAttribute('content');
$mediatype = 'photo';
}
} // for loop statement
$out['mediatype'] = $mediatype;
$out['descriptionc'] = $description;
return $out;
//getting the meta tag data
}
}
/*output*/
$igurl = 'https://www.instagram.com/p/COf0dN0M8pU/';
$output = checkinstaurl($igurl);
echo "<pre>";
print_r($output);
?>
This above code, At Localhost returns the complete html with meta tags but on live domain meta tags with og property is missing.
I am using two JSON feed sources and PHP to display a real estate property slideshow with agents on a website. The code was working prior to the feed provider making changes to where they store property and agent images. I have made the necessary adjustments for the images, but the feed data is not working now. I have contacted the feed providers about the issue, but they say the problem is on my end. No changes beyond the image URLs were made, so I am unsure where the issue may be. I am new to JSON, so I might be missing something. I have included the full script below. Here are the two JSON feed URLs: http://century21.ca/FeaturedDataHandler.c?DataType=4&EntityType=2&EntityID=2119 and http://century21.ca/FeaturedDataHandler.c?DataType=3&AgentID=27830&RotationType=1. The first URL grabs all of the agents and the second grabs a single agent's properties. The AgentID value is sourced from the JSON feed URL dynamically.
class Core
{
private $base_url;
private $property_image_url;
private $agent_id;
private $request_agent_properties_url;
private $request_all_agents_url;
private function formatJSON($json)
{
$from = array('Props:', 'Success:', 'Address:', ',Price:', 'PicTicks:', ',Image:', 'Link:', 'MissingImage:', 'ShowingCount:', 'ShowcaseHD:', 'ListingStatusCode:', 'Bedrooms:', 'Bathrooms:', 'IsSold:', 'ShowSoldPrice:', 'SqFootage:', 'YearBuilt:', 'Style:', 'PriceTypeDesc:');
$to = array('"Props":', '"Success":', '"Address":', ',"Price":', '"PicTicks":', ',"Image":', '"Link":', '"MissingImage":', '"ShowingCount":', '"ShowcaseHD":', '"ListingStatusCode":', '"Bedrooms":', '"Bathrooms":', '"IsSold":', '"ShowSoldPrice":', '"SqFootage":', '"YearBuilt":', '"Style":', '"PriceTypeDesc":' );
return str_ireplace($from, $to, $json); //returns the clean JSON
}
function __construct($agent=false)
{
$this->base_url = 'http://www.century21.ca';
$this->property_image_url = 'http://images.century21.ca';
$this->agent_id = ($agent ? $agent : false);
$this->request_all_agents_url =
$this->base_url.'/FeaturedDataHandler.c?DataType=4&EntityType=3&EntityID=3454';
$this->request_agent_properties_url =
$this->base_url.'/FeaturedDataHandler.c?DataType=3'.'&AgentID='.$this->agent_id.'&RotationType=1';
}
/**
* getSlides()
*/
function getSlides()
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $this->request_all_agents_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$response = curl_exec($ch);
curl_close($ch);
if (empty($response))
return false;
else
$agents = $this->decode_json_string($response);
// Loop Agents And Look For Requested ID
foreach ($agents as $agent)
{
if (($this->agent_id != false) && (isset($agent['WTLUserID'])) && ($agent['WTLUserID'] != $this->agent_id))
{
continue; // You have specified a
}
$properties = $this->getProperties($agent['WTLUserID']);
$this->print_property_details($properties, $agent);
}
}
/**
* getProperties()
*/
function getProperties($agent_id)
{
$url = $this->base_url.'/FeaturedDataHandler.c?DataType=3'.'&AgentID='.$agent_id.'&RotationType=1';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
if (empty($response))
die('No response 2'); //return false;
else
$json = $this->formatJSON($this->decode_json_string($response));
var_dump($json);
die();
// return $json;
}
/**
* print_property_details()
*/
function print_property_details($properties, $agent, $html='')
{
$BASE_URL = $this->base_url;
$PROPERTY_IMAGE_URL = $this->property_image_url;
foreach ($properties as $property)
{
$img = $property['Image'];
// $img = ($property['Image'] ? $property['Image'] : "some url to a dummy image here")
if($property['ListingStatusCode'] != 'SOLD'){
$address = $property['Address'];
$shortaddr = substr($address, 0, -12);
$html .= "<div class='listings'>";
$html .= "<div class='property-image'>";
$html .= "<img src='". $PROPERTY_IMAGE_URL ."' width='449' height='337' alt='' />";
$html .= "</div>";
$html .= "<div class='property-info'>";
$html .= "<span class='property-price'>". $property['Price'] ."</span>";
$html .= "<span class='property-street'>". $shortaddr ."</span>";
$html .= "</div>";
$html .= "<div class='agency'>";
$html .= "<div class='agent'>";
$html .= "<img src='". $agent['PhotoUrl']. "' class='agent-image' width='320' height='240' />";
$html .= "<span class='agent-name'><b>Agent:</b>". $agent['DisplayName'] ."</span>";
$html .= "</div>";
$html .= "</div>";
$html .= "</div>";
}
}
echo $html;
}
function decode_json_string($json)
{
// Strip out junk
$strip = array("{\"Agents\": [","{Props: ",",Success:true}",",\"Success\":true","\r","\n","[{","}]");
$json = str_replace($strip,"",$json);
// Instantiate array
$json_array = array();
foreach (explode("},{",$json) as $row)
{
/// Remove commas and colons between quotes
if (preg_match_all('/"([^\\"]+)"/', $row, $match)) {
foreach ($match as $m)
{
$row = str_replace($m,str_replace(",","|comma|",$m),$row);
$row = str_replace($m,str_replace(":","|colon|",$m),$row);
}
}
// Instantiate / clear array
$array = array();
foreach (explode(',',$row) as $pair)
{
$var = explode(":",$pair);
// Add commas and colons back
$val = str_replace("|colon|",":",$var[1]);
$val = str_replace("|comma|",",",$val);
$val = trim($val,'"');
$val = trim($val);
$key = trim($var[0]);
$key = trim($key,'{');
$key = trim($key,'}');
$array[$key] = $val;
}
// Add to array
$json_array[] = $array;
}
return $json_array;
}
}
Try this code to fix the JSON:
$url = 'http://century21.ca/FeaturedDataHandler.c?DataType=3&AgentID=27830&RotationType=1';
$invalid_json = file_get_contents($url);
$json = preg_replace("/([{,])([a-zA-Z][^: ]+):/", "$1\"$2\":", $invalid_json);
var_dump($json);
All your keys need to be double-quoted
JSON on the second URL is not a valid JSON, that's why you're not getting the reults, as PHP unable to decode that feed.
I tried to process it, and get this error
Error: Parse error on line 1:
{Props: [{Address:"28
-^
Expecting 'STRING', '}'
Feed image for first URL
and here is view of 2nd URL's feed
as per error for second feed, all the keys should be wrapped within " as these are strings rather than CONSTANTS.
e.g.
Props should be "Props" and all other too.
EDIT
You need to update your functionand add this one(formatJSON($json)) to your class
// Update this function, just need to update last line of function
function getProperties($agent_id)
{
$url = $this->base_url.'/FeaturedDataHandler.c?DataType=3'.'&AgentID='.$agent_id.'&RotationType=1';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$response = curl_exec($ch);
curl_close($ch);
$json = json_decode($response);
if (empty($response))
die('No response 2'); //return false;
else
return $this->formatJSON($this->decode_json_string($response)); //this one only need to be updated.
}
//add this function to class. This will format json
private function formatJSON($json){
$from= array('Props:', 'Success:', 'Address:', ',Price:', 'PicTicks:', ',Image:', 'Link:', 'MissingImage:', 'ShowingCount:', 'ShowcaseHD:', 'ListingStatusCode:', 'Bedrooms:', 'Bathrooms:', 'IsSold:', 'ShowSoldPrice:', 'SqFootage:', 'YearBuilt:', 'Style:', 'PriceTypeDesc:');
$to = array('"Props":', '"Success":', '"Address":', ',"Price":', '"PicTicks":', ',"Image":', '"Link":', '"MissingImage":', '"ShowingCount":', '"ShowcaseHD":', '"ListingStatusCode":', '"Bedrooms":', '"Bathrooms":', '"IsSold":', '"ShowSoldPrice":', '"SqFootage":', '"YearBuilt":', '"Style":', '"PriceTypeDesc":' );
return str_ireplace($from, $to, $json); //returns the clean JSON
}
EDIT
I've tested that function, and it's working fine, may be there is something wrong with your function decode_json_string($json)
I've taken unclean json from second URL, and cleaning it here, and putting that cleaned json in json editor to check either it's working or not HERE
I would like to do parse XML for the url:
http://www.opencellid.org/cell/get?key=myapikey&mcc=250&mnc=99&cellid=29513&lac=0
I am the beginner of PHP and tried the code as below:
$url = "http://www.opencellid.org/cell/get?key=myapikey&mcc=250&mnc=99&cellid=29513&lac=0";
$xml = simplexml_load_file($url) or die("feed not loading");
print_r($xml);
var_dump($xml);
I would like to do echo for each attribute e.g. lat, lon, range.. for this XML url.
I found many resource in stackoverflow which the XML is quite standard. I cannot find an example which is used for this format of XML:
Anyone could give me an idea? Thanks.
$url = "http://www.opencellid.org/cell/get?key=myapikey&mcc=250&mnc=99&cellid=29513&lac=0";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
$status = curl_getinfo($ch,CURLINFO_HTTP_CODE);
curl_close($ch);
if($status==200){
$x = new SimpleXMLElement($data,LIBXML_NOCDATA);
echo 'lat: '.$x->cell['lat'];
echo 'lon: '.$x->cell['lon'];
echo 'range: '.$x->cell['range'];
}
Update:
This is an alternative to curl. If you have curl installed, use curl, it is faster.
$url = "http://www.opencellid.org/cell/get?key=myapikey&mcc=250&mnc=99&cellid=29513&lac=0";
$data = file_get_contents($url);
$x = new SimpleXMLElement($data,LIBXML_NOCDATA);
$lat = $x->cell['lat'];
$lng = $x->cell['lon'];
$range = $x->cell['range'];
echo 'lat: '.$lat.'<br>';
echo 'lng: '.$lng.'<br>';
echo 'range: '.$range.'<br>';
Please try this
$content = file_get_contents('http://www.opencellid.org/cell/get?key=myapikey&mcc=250&mnc=99&cellid=29513&lac=0');
$x = new SimpleXmlElement($content);
foreach($x->cell as $entry) {
echo $entry['lat']."==".$entry['lon']; exit;
}
$url = "http://www.opencellid.org/cell/get?key=myapikey&mcc=250&mnc=99&cellid=29513&lac=0";
$xml = simplexml_load_file($url) or die("feed not loading");
$attr = array("lat", "lon"); //list attr you require
foreach($xml->cell as $cell){
$data = $cell->attributes();
foreach ($attr as $key) {
// echo "<br>key: $key, value: " . $data[$key];
//edit to insert
$columns[] = $key;
$values[] = $data[$key];
}
echo $query = "insert into table_name(".implode(',',$columns).") values (".implode(',',$values).")";
}
This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
Screen scapingin in php using file_get_contents
Can anyone help me.. I am trying to scrape Hotel reviews from LateRooms.com dont tell me its a bad idea because I already have permission as an affiliate
My code:
<?php
header('content-type: text/plain');
$contents = file_get_contents('http://www.laterooms.com/en/hotel-reviews/238902_the-westfield-bb-sandown.aspx');
$contents = preg_replace('/\s(1,)/', ' ', $contents);
print $contents . "\n";
$records = preg_split('/<div id="review/', $contents);
for ($ix = 1; $ix < count($records); $ix++) {
$tmp = $records[$ix];
preg_match('/id="review"/', $tmp, $match_reviews);
print_r($match_reviews);
exit();
}
?>
This works really well the only problem is that It pulls in the whole page of code and doesnt match the div id 'review'
Thanks in advance
function file_get_contents_curl($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
function DOMinnerHTML($element){
$innerHTML = "";
$children = $element->childNodes;
foreach ($children as $child)
{
$tmp_dom = new DOMDocument();
$tmp_dom->appendChild($tmp_dom->importNode($child, true));
$innerHTML.=trim($tmp_dom->saveHTML());
}
return $innerHTML;
}
$url = 'http://www.laterooms.com/en/hotel-reviews/238902_the-westfield-bb-sandown.aspx';
$html = file_get_contents_curl($url);
//parsing begins here:
$doc = new DOMDocument();
#$doc->loadHTML($html);
$div_elements = $doc->getElementsByTagName('div');
if ($div_elements->length <> 0){
foreach ($div_elements as $div_element) {
if ($div_element->getAttribute('class') == 'review newReview'){
$reviews[] = DOMinnerHTML($div_element);
}
}
}
print_r($reviews);
Try this, it will return all reviews. You can refine the content as per your requirement.