php error when XML item does not exist - php

I am using PHP to grab an XML feed and display it on my website. The feed comes from the NewsReach blog.
I am using some simple PHP code to get the details, as shown below:
$feed = new SimpleXMLElement('http://blog.newsreach.co.uk/atom.xml', null, true);
$i = 0;
foreach($feed->entry as $entry)
{
if ($i < 4)
{
$title = mysql_real_escape_string("{$entry->title}");
$summary = mysql_real_escape_string("{$entry->content}");
$summary = strip_tags($summary);
$summary = preg_replace('/\s+?(\S+)?$/', '', substr($summary, 0, 100));
$url = mysql_real_escape_string("{$entry->link[4]['href']}");
$media = $entry->children('http://search.yahoo.com/mrss/');
$attrs = $media->thumbnail[0]->attributes();
$img = $attrs['url'];
}
}
The problem is that the media thumbnail tag does not exist in every blog post, which causes an error and stops the XML grabber from working.
I have tried things like:
if ($media == 0)
{
}
else
{
$attrs = $media->thumbnail[0]->attributes();
$img = $attrs['url'];
}
or
if ($media['thumbnail'] == 0)
{
}
else
{
$attrs = $media->thumbnail[0]->attributes();
$img = $attrs['url'];
}
Neither worked. I was hoping someone could help me check whether the XML item exists and then process the entry accordingly.
Thanks all

You could check if it's set and not empty:
$img = '';
if (!empty($media->thumbnail[0])) {
$attrs = $media->thumbnail[0]->attributes();
$img = $attrs['url'];
}
Remember that $media is an object. Array syntax on a SimpleXMLElement reads attributes, not child elements, so $media['thumbnail'] should be $media->thumbnail.
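As a minimal sketch of that check, the snippet below runs against a small inline feed rather than the real one. The two-entry feed and the example.com URL are made up for illustration; only the media namespace URI comes from the question. It uses isset() on the child element, which behaves like the empty() check above for this purpose.

```php
<?php
// Hypothetical two-entry Atom feed: only the first entry has a media:thumbnail.
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry><title>With image</title><media:thumbnail url="http://example.com/a.jpg"/></entry>
  <entry><title>Without image</title></entry>
</feed>
XML;

$feed = new SimpleXMLElement($xml);
$images = array();

foreach ($feed->entry as $entry) {
    // Children in the media namespace (may contain no thumbnail at all)
    $media = $entry->children('http://search.yahoo.com/mrss/');

    $img = '';
    if (isset($media->thumbnail)) {
        // Only touch attributes() once we know the element exists
        $attrs = $media->thumbnail->attributes();
        $img = (string) $attrs['url'];
    }
    $images[] = $img;
}

print_r($images);
```

Entries without a thumbnail simply end up with an empty string, so the loop never calls attributes() on a missing element.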

Related

Skipping or scraping <script> tags

I'm trying to scrape some information from a web page with the Simple HTML DOM Parser. Elements that contain a <script> tag are throwing my counters off.
The tag looks like:
// <div id="result-title-2" class="offerList-item-description-title">
<script type="text/javascript">
document.write(getContents('##UD9Jj\>2?E:4 9:;23'));
</script>Ro­mantic hijab
</div>
I either need to be able to get the contents or make my programme skip it.
This is how I am currently grabbing all of my Elements:
foreach($html->find('.offerList-item') as $element)
{
$count++;
foreach($element->find('.offerList-item-image img') as $image)
{
//$images[] = '<img src="'.$image->src.'">'.'<br>';//$img->src;
$images[] = $image->src;//$img->src;
}
foreach($element->find('.offerList-item-description-title') as $title)
{
$titles[] = $title->innertext;
}
//foreach($element->find('.priceRange-from') as $price) {
foreach($element->find('.priceRange-from')as $price){
$pound = $price->find('text',1);
$number = $price->find('text',2);
$prices[] = $pound.' '.$number;
}
foreach($element->find('.offerList-itemWrapper') as $compare) //Get store links
{
$links[] = $idealo.$compare->getAttribute('href');
}
}
Download the HTML page to local disk first, so that you can deal with the script tags before scraping:
$url2 = "https://www.idealo.co.uk/mscat.html?q=Dashboard+Cleaner";
//Code to get the file...
$data2 = file_get_contents($url2);
//save as?
$filename = "test.html";
//save the file...
$fh = fopen($filename,"w");
fwrite($fh,$data2);
fclose($fh);
After this code has run, scrape the saved page as before.
Finally, neutralise the script tags by turning them into HTML comments:
$target = array('<script type="text/javascript">', '</script>');
$convert = array('<!--<script type="text/javascript">', '</script>-->');
$result = str_replace($target, $convert, $title);
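An alternative to commenting the tags out is to remove the script elements before reading the text. The sketch below swaps in PHP's built-in DOMDocument instead of the Simple HTML DOM Parser, and the markup is a shortened, made-up version of the snippet in the question.

```php
<?php
// Shortened stand-in for the markup in the question
$html = '<div id="result-title-2" class="offerList-item-description-title">'
      . '<script type="text/javascript">document.write("x");</script>'
      . 'Romantic hijab</div>';

libxml_use_internal_errors(true);   // real-world HTML is rarely valid
$doc = new DOMDocument();
$doc->loadHTML($html);

// Copy the live node list to an array first: removing nodes while
// iterating the live list would skip elements.
$scripts = iterator_to_array($doc->getElementsByTagName('script'));
foreach ($scripts as $script) {
    $script->parentNode->removeChild($script);
}

// With the script gone, the div's text is just the title
$title = trim($doc->getElementsByTagName('div')->item(0)->textContent);
echo $title;
```

Because the script node is removed from the tree, its contents can no longer leak into the title or shift the counters.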

Loading content from remote site doesn't work, but why?

I'm still working on this catalogue for a client, which loads images from a remote site via PHP and the Simple DOM Parser.
// Code excerpt from http://internetvolk.de/fileadmin/template/res/scrape.php, this is just one case of a select
$subcat = $_GET['subcat'];
$url = "http://pinesite.com/meubelen/index.php?".$subcat."&lang=de";
$html = file_get_html(html_entity_decode($url));
$iframe = $html->find('iframe',0);
$url2 = $iframe->src;
$html->clear();
unset($html);
$fullurl = "http://pinesite.com/meubelen/".$url2;
$html2 = file_get_html(html_entity_decode($fullurl));
$pagecount = 1;
$titles = $html2->find('.tekst');
$images = $html2->find('.plaatje');
$output = array(); // must be an array: $output[] = ... on a string is a fatal error in PHP 7.1+
$i=0;
foreach ($images as $image) {
$item['title'] = $titles[$i]->find('p',0)->plaintext;
$imagePath = $image->find('img',0)->src;
$item['thumb'] = resize("http://pinesite.com".str_replace('thumb_','',$imagePath),array("w"=>225, "h"=>162));
$item['image'] = 'http://pinesite.com'.str_replace('thumb_','',$imagePath);
$fullurl2 = "http://pinesite.com/meubelen/prog/showpic.php?src=".str_replace('thumb_','',$imagePath)."&taal=de";
$html3 = file_get_html($fullurl2);
$item['size'] = str_replace(' ','',$html3->find('td',1)->plaintext);
unset($html3);
$output[] = $item;
$i++;
}
if (count($html2->find('center')) > 1) {
// ok, multi-page here, let's find out how many there are
$pagecount = count($html2->find('center',0)->find('a'))-1;
for ($i=1;$i<$pagecount; $i++) {
$startID = $i*20;
$newurl = html_entity_decode($fullurl."&beginrec=".$startID);
$html3 = file_get_html($newurl);
$titles = $html3->find('.tekst');
$images = $html3->find('.plaatje');
$a=0;
foreach ($images as $image) {
$item['title'] = $titles[$a]->find('p',0)->plaintext;
$item['image'] = 'http://pinesite.com'.str_replace('thumb_','',$image->find('img',0)->src);
$item['thumb'] = resize($item['image'],array("w"=>225, "h"=>150));
$output[] = $item;
$a++;
}
$html3->clear();
unset ($html3);
}
}
echo json_encode($output);
So what it should do (and does for some categories) is output the images, the titles and the thumbnails from this page: http://pinesite.com
This works, for example, if you pass it a "?function=images&subcat=antiek", but not if you pass it a "?function=images&subcat=stoelen". I don't even think it's a problem with the remote page, so there has to be an error in my code.
Ehm..trying to state the obvious maybe but 'stoele'?
As it turns out, my code was completely fine, it was a missing space in the HTML of the remote site that got the Simple PHP DOM Parser to not recognize the iframe I was looking for. I fixed it on my end by running a str_replace on the code first to replace the faulty code.
I know it's a dirty solution, but it works :)
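The exact defect is not shown above, so the following is only a sketch of the approach, assuming the missing space sat between two attributes of the iframe tag. The markup and attribute values are invented, and DOMDocument stands in for the Simple HTML DOM Parser so the example runs without extra libraries.

```php
<?php
// Hypothetical remote markup: the space between two attributes is missing
$raw = '<html><body>'
     . '<iframe name="content"src="prog/list.php?cat=stoelen"></iframe>'
     . '</body></html>';

// Repair the known defect on the raw string before handing it to a parser
$fixed = str_replace('"src=', '" src=', $raw);

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($fixed);

// Guard against the iframe still not being found
$iframe = $doc->getElementsByTagName('iframe')->item(0);
$src = $iframe !== null ? $iframe->getAttribute('src') : '';
echo $src;
```

With the Simple HTML DOM Parser the same idea would mean fetching the page as a string, repairing it, and then parsing that string instead of letting the parser fetch the URL directly.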

Simple Youtube API Query and Save to Variable

I am trying to search YouTube through the API, save the result to a variable, and then echo it. I'm having trouble getting this to work. I have included the entire code in HTML. I'm not sure whether the problem is loading the YouTube script library or a syntax error. Thanks!
<html>
<body>
<?php
$params="puppy";
function youtube_find_video($params)
{
str_replace("'", "", $params);
$q = preg_replace('/[[:space:]]/', '/', trim($params));
$q = utf8_decode(utf8_encode($q));
$replacements = array(',', '?', '!', '.');
$q = str_replace($replacements, "", $q);
$feedURL = "http://gdata.youtube.com/feeds/api/videos/-/{$q}?orderby=relevance&max-results=1";
$sxml = simplexml_load_file($feedURL);
if(!$sxml)
{
return false;
}
else{
$entry = $sxml->entry;
if(!$entry)
{
return false;
}
// get nodes in media: namespace for media information
$media = $entry->children('http://search.yahoo.com/mrss/');
if($media)
{
// get video player URL
$attrs = $media->group->player->attributes();
$url = $attrs['url'];
if(!$url)
{
return false;
break;
}
parse_str( parse_url( $url, PHP_URL_QUERY ), $my_array_of_vars );
$watch['id'] = $my_array_of_vars['v'];
// get video name
$watch['name'] = $media->group->title;
// get <yt:duration> node for video length[minute]
$yt = $media->children('http://gdata.youtube.com/schemas/2007');
$attrs = $yt->duration->attributes();
$watch['length'] = sprintf("%0.2f", $attrs['seconds']/60);
$watch = simplexml_kurtul($watch);
return $watch;
echo $watch;
}
else
{
return false;
}
}
}
youtube_find_video();
?>
</body>
</html>
You are calling youtube_find_video() without passing $params. Change the last line of PHP to:
youtube_find_video($params);
Also, please post the errors you get. It's very hard to help without knowing what's wrong.
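To illustrate passing the argument and saving the result to a variable, here is a small network-free sketch. build_query() is a hypothetical helper that mirrors only the string clean-up steps from the question; it is not part of any YouTube API. Note that str_replace() returns the new string, so its result has to be assigned, and that the echo happens in the caller because nothing after a return statement runs.

```php
<?php
// Hypothetical helper: mirrors the query clean-up from the question
function build_query($params)
{
    // str_replace() returns the modified string; it does not change $params
    $q = str_replace("'", "", $params);
    // Collapse runs of whitespace into a single slash
    $q = preg_replace('/[[:space:]]+/', '/', trim($q));
    // Drop punctuation that should not appear in the query
    $q = str_replace(array(',', '?', '!', '.'), '', $q);
    return $q;
}

// Pass the argument and keep the return value in a variable
$query = build_query("puppy's first, day!");
echo $query;
```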

having problems passing array values

I'm building a PHP program that grabs only the image links from my Twitter feed and displays them on a page. It has three components, all of which work fine on their own.
The first component is the Twitter OAuth component, which grabs the tweet text and creates an array. This works fine by itself.
The second is a function that processes the tweets and returns only those that contain image links. This works fine as well.
The program breaks down in the third section, where the links are processed and an image is displayed. I had no issues running this part on its own, and from my attempts to troubleshoot, it appears to break at the $images array, as that array is empty.
I'm sure I've made a silly mistake but I've been trying to find this for over a day now and can't seem to fix it. Any help would be great! Thanks guys!
code:
<?php
if ($result['socialorigin']== "twitter"){
$twitterObj = new EpiTwitter($consumer_key, $consumer_secret);
$token = $twitterObj->getAccessToken();
$twitterObj->setToken($result['oauthtoken'], $result['oauthsecret']);
$tweets = $twitterObj->get('/statuses/home_timeline.json',array('count'=>'200'));
$all_tweets = array();
$hosts = "lockerz|yfrog|twitpic|tumblr|mypict|ow.ly|instagr";
foreach($tweets as $tweet) {
$twtext = $tweet->text;
if(preg_match("~http://($hosts)~", $twtext)){
preg_match_all("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t<]*)#ise", $twtext, $matches, PREG_PATTERN_ORDER);
foreach($matches[0] as $key2 => $link){
array_push($all_tweets,"$link");
}
}
}
function height_compare($a1, $b1)
{
if ($a1 == $b1) {
return 0;
}
return ($a1 > $b1) ? -1 : 1;
}
foreach($all_tweets as $alltweet => $tlink){
$doc = new DOMDocument();
// Okay, this HTML is kind of screwy
// So we're going to suppress errors
@$doc->loadHTMLFile($tlink);
// Get all images
$images_list = $doc->getElementsByTagName('img');
$images = array();
foreach($images_list as $image) {
// Get the src attribute
$image_source = $image->getAttribute('src');
if (substr($image_source,0,7)=="http://"){
$image_size_info = getimagesize($image_source);
$images[$image_source] = $image_size_info[1];
}
}
// Do a numeric sort on the height
uasort($images, "height_compare");
$tallest_image = array_slice($images, 0,1);
$mainimg = key($tallest_image);
echo "<img src='$mainimg' />";
}
print_r($all_tweets);
print_r($images);
}
Move the initialisation of the $images array outside the foreach loop that fetches the actual images. This stops the loop from clearing the array on every iteration.
$images = array();
foreach($all_tweets as $alltweet => $tlink){
$doc = new DOMDocument();
// Okay, this HTML is kind of screwy
// So we're going to suppress errors
@$doc->loadHTMLFile($tlink);
// Get all images
$images_list = $doc->getElementsByTagName('img');
foreach($images_list as $image) {
// Get the src attribute
$image_source = $image->getAttribute('src');
if (substr($image_source,0,7)=="http://"){
$image_size_info = getimagesize($image_source);
$images[$image_source] = $image_size_info[1];
}
}
// Do a numeric sort on the height
uasort($images, "height_compare");
$tallest_image = array_slice($images, 0,1);
$mainimg = key($tallest_image);
echo "<img src='$mainimg' />";
}
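The effect of initialising the array once can be checked without any network access. In this sketch the $pages array is invented data standing in for the image heights that getimagesize() would report for each link; height_compare() is the function from the question.

```php
<?php
function height_compare($a1, $b1)
{
    if ($a1 == $b1) {
        return 0;
    }
    return ($a1 > $b1) ? -1 : 1;   // taller images sort first
}

// Hypothetical data: image URL => height, grouped by the page it was found on
$pages = array(
    'http://example.com/page1' => array(
        'http://example.com/a.jpg' => 120,
        'http://example.com/b.jpg' => 480,
    ),
    'http://example.com/page2' => array(),   // a page with no usable images
);

$images = array();   // initialised once, outside the loop
foreach ($pages as $page => $found) {
    foreach ($found as $src => $height) {
        $images[$src] = $height;
    }
}

// Sort by height, keeping the URL keys, then read the first key
uasort($images, 'height_compare');
reset($images);
$tallest = key($images);
echo $tallest;
```

Had $images been reset inside the outer loop, the empty second page would have wiped the results from the first one before they were printed.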

SimpleXML get next/prev node

I'm building a photo gallery, building an object based on an xml file.
How can I grab the next and previous nodes? Here's what my base code looks like:
$xmlData = new SimpleXMLElement(file_get_contents("data.xml"));
foreach($xmlData->row as $item) {
if ($item->url == $_GET['id']) {
// show photo
$title = $item->title;
}
}
This only works if the next/previous nodes are of the same type (here, row elements). If you need more complex processing, use DOM.
$xmlData = new SimpleXMLElement(file_get_contents("data.xml"));
$index = 0;
foreach($xmlData->row as $item) {
if ($item->url == $_GET['id']) {
// show photo
$title = $item->title;
$prev = $xmlData->row[$index-1];
$next = $xmlData->row[$index+1];
}
$index++;
}
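A self-contained variant with bounds checks is sketched below. The inline XML and the $id variable (standing in for $_GET['id']) are assumptions made so the example runs on its own; the first and last rows get null for the missing neighbour.

```php
<?php
// Inline stand-in for data.xml
$xmlData = new SimpleXMLElement(
    '<data>'
    . '<row><url>a</url><title>First</title></row>'
    . '<row><url>b</url><title>Second</title></row>'
    . '<row><url>c</url><title>Third</title></row>'
    . '</data>'
);

$id = 'b';                        // stands in for $_GET['id']
$prev = null;
$next = null;
$total = count($xmlData->row);    // number of <row> elements
$index = 0;

foreach ($xmlData->row as $item) {
    if ((string) $item->url === $id) {
        // Only look up neighbours that actually exist
        if ($index > 0) {
            $prev = $xmlData->row[$index - 1];
        }
        if ($index < $total - 1) {
            $next = $xmlData->row[$index + 1];
        }
        break;
    }
    $index++;
}

echo $prev->title, ' / ', $next->title;
```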
