I'm working on an RSS feed for a website I made. It takes input from my homemade news function on the site, which is stored in a MySQL database.
Now I can get the text nicely enough, but when I try to use <enclosure> to put in an image, nothing shows up.
The code I use to insert the enclosure is as follows:
if($rows['image'] != 0) {
$image = mysql_fetch_array(mysql_query("SELECT * FROM dafl_news_imagedb WHERE id = '".$rows['image']."' LIMIT 1"));
$imageUrl = "http://dafl.dk/content/news/pics/".$image['filename'];
$imageType = substr($imageUrl, strlen($imageUrl) - 3, 3);
$enclosedImage = '
<enclosure url="'.$imageUrl.'" length="0" type="image/'.$imageType.'" />
';
echo $enclosedImage;
}
And this is what ends up in the source of the RSS feed:
<enclosure url="http://dafl.dk/content/news/pics/13.png" length="0" type="image/png" />
The link to the rss is:
http://dafl.dk/rss/?language=en
(The picture is only included when an image is present for the news post. Is it a problem that not all items have an enclosure?)
Try this link to see if the enclosure tag works in the target browser.
http://www.w3schools.com/rss/tryrss.asp?filename=rss_ex_enclosure
References:
http://www.w3schools.com/rss/rss_tag_enclosure.asp
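As a side note, the RSS 2.0 specification expects all three enclosure attributes to be meaningful: url, length (the file size in bytes) and type (a MIME type). Taking the last three characters of the file name would yield "image/jpg" for a .jpg file, which is not a registered type (it should be image/jpeg). Here is a minimal sketch of the idea in JavaScript rather than the PHP from the question; enclosureTag, escapeXmlAttribute and the extension table are hypothetical names made up for illustration, and whether a given feed reader displays the enclosure is still up to that reader.

```javascript
// Hypothetical lookup table: file extension -> registered image MIME type.
const MIME_BY_EXTENSION = new Map([
  ['png', 'image/png'],
  ['gif', 'image/gif'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg']
]);

// Escape a value so it can sit safely inside a double-quoted XML attribute.
function escapeXmlAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build an <enclosure> element, or return '' when the extension is unknown.
// lengthInBytes is assumed to be the real file size, looked up by the caller.
function enclosureTag(url, lengthInBytes) {
  const extension = url.split('.').pop().toLowerCase();
  const type = MIME_BY_EXTENSION.get(extension);
  if (!type) {
    return '';
  }
  return '<enclosure url="' + escapeXmlAttribute(url) +
    '" length="' + Number(lengthInBytes) +
    '" type="' + type + '" />';
}

// Example with a made-up file size of 2048 bytes.
const tag = enclosureTag('http://dafl.dk/content/news/pics/13.png', 2048);
```

Returning an empty string for unknown extensions means such items simply get no enclosure, which matches the feed's existing behaviour of only adding one when an image is present.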
I'd like to be able to grab data, such as a list of articles, from Yahoo Finance. At the moment I have a locally hosted web page that searches Yahoo Finance for stock symbols (e.g. NOK). It then returns the opening price, the current price, and how far up or down the price has gone.
What I'd like to do is grab the related links that Yahoo has on the page. These links point to articles related to the share price. E.g. on https://au.finance.yahoo.com/q?s=nok&ql=1, scroll down to the headlines; I'd like to grab those links.
At the moment I'm working from a book (PHP Advanced for the World Wide Web; I know it's old, but I found it lying around yesterday and it's quite interesting :) ). The book says 'It's important when accessing web pages to know exactly where the data is'. I would think by now there would be a way around this, maybe the ability to search for links that contain a particular keyword or something like that!
I'm wondering if there's a special trick I can use to grab particular bits of data on a web page? Crawlers, for example, are able to grab links that are related to something.
It would be great to know how to do this, so I'd be able to apply it to other subjects in the future.
I'll add the code I have at the moment. This is purely for practice, as I'm learning PHP in my course :)
##getquote.php
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/2000/REC-xhtml1-20000126/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Get Stock Quotes</title>
<link href='css/style.css' type="text/css" rel="stylesheet" />
</head>
<body>
<h1>Stock Reader</h1>
<?php
//Read[1] = current price
//read[5] = opening price
//read[4] = down or up whatever percent from opening according to current price
//Step one
//Begin the PHP section by checking if the form has been submitted
if(isset($_POST['submit'])){
//Step two
//Check if a stock symbol was entered (an empty text field is still "set", so test for empty).
if(!empty($_POST['symbol'])){
//Define the url to be opened
$url = 'http://quote.yahoo.com/d/quotes.csv?s=' . $_POST['symbol'] . '&f=sl1d1t1c1ohgv&e=.csv';
//Open the url, if can't SHUTDOWN script and write msg
$fp = fopen($url, 'r') or die('Cannot Access YAHOO!.');
//This will get the first 30 characters from the file located in $fp
$read = fgetcsv ($fp, 30);
//Close the file processing.
fclose($fp);
include("php/displayDetails.php");
}
else{
echo "<div style='color:red'>Please enter a SYMBOL before submitting the form</div>";
}
}
?>
<form action='getquote.php' method='post'>
<p>Symbol: </p><input type='text' name='symbol'>
<br />
<input type="submit" value='Fetch Quote' name="submit">
</form>
<br />
<br />
##displayDetails.php
<div class='display-contents'>
<?php
echo "<div>Today's date: " . $read[2] . "</div>";
//Current price
echo "<div>The current value for " . $_POST["symbol"] . " is <strong>$ " . $read[1] . "</strong></div>";
//Opening Price
echo "<div>The opening value for " . $_POST["symbol"] . " is <strong>$ " . $read[5] . "</strong></div>";
if($read[1] < $read[5])
{
//Down or Up depending on opening.
echo "<div>" .strtoupper($_POST['symbol']) ."<span style='color:red'> <em>IS DOWN</em> </span><strong>$" . $read[4] . "</strong></div>";
}
else{
echo "<div>" . strtoupper($_POST['symbol']) ."<span style='color:green'> <em>IS UP</em> </span><strong>$" . $read[4] . "</strong></div>";
}
// Added code to displayDetails.php
function getLinks($url){
$siteContent = file_get_contents($url);
// everything after the yfi_headlines wrapper
$div = explode('class="yfi_headlines">',$siteContent)[1];
// everything before the trailing "ft" div is the content you want
$innerContent = explode('<div class="ft">',$div)[0];
$list = explode("<ul>",$innerContent)[1];
$list = explode("</ul>",$list)[0];
echo $list;
}
?>
</div>
I just pasted the same code in; I didn't really know what I should do with it?!
I don't know about fgetcsv, but with file_get_contents you can grab the whole content of a page into a string variable.
Then you can search for links in the string (do not use regex to search HTML content: Link regex)
I briefly looked at Yahoo's source code, so you can do the following:
-yfi_headlines is the class of the div which wraps the desired links
$siteContent = file_get_contents($url);
$div = explode('class="yfi_headlines">',$siteContent)[1]; // every thing inside is a content you want
-the last class inside that div is: ft
$innerContent = explode('<div class="ft">',$div)[0]; //now you have inner content of your div;
Repeat to get the inner content of the <ul>:
$list = explode("<ul>",$innerContent)[1];
$list = explode("</ul>",$list)[0];
Now you have the list of links in the format: <li>text</li>
There are more efficient ways to parse a web page, such as using DOMDocument:
Example
For getting content of a page you can also look at this answer
https://stackoverflow.com/a/15706743/2656311
[ADDITIONALLY] If it is a large website, call ini_set("memory_limit","1024M"); at the beginning of the function so you can hold more data in memory!
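To make the marker-splitting steps above concrete, here is a small sketch of the same idea in JavaScript instead of PHP. The between helper and the sample markup are made up for illustration; the real Yahoo page may look different and can change at any time, so treat the markers as assumptions.

```javascript
// Return the text between the first `startMarker` and the next `endMarker`,
// or '' if either marker is missing (explode(...)[1] would raise a notice instead).
function between(text, startMarker, endMarker) {
  const start = text.indexOf(startMarker);
  if (start === -1) {
    return '';
  }
  const from = start + startMarker.length;
  const end = text.indexOf(endMarker, from);
  if (end === -1) {
    return '';
  }
  return text.slice(from, end);
}

// Made-up sample markup, shaped like the structure described above.
const samplePage =
  '<div class="yfi_headlines"><div class="bd"><ul>' +
  '<li><a href="/news/a.html">Story A</a></li>' +
  '<li><a href="/news/b.html">Story B</a></li>' +
  '</ul></div><div class="ft">More</div></div>';

// Step 1: everything inside the headlines wrapper, up to the "ft" div.
const headlines = between(samplePage, 'class="yfi_headlines">', '<div class="ft">');
// Step 2: the inner content of the <ul>.
const list = between(headlines, '<ul>', '</ul>');
```

Here list ends up holding the two <li> items. Checking for a missing marker before slicing matters in practice, because the page layout is outside your control.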
Hello, I'm using cURL to get information from Wikipedia, and I want to receive only information about the principal image. I don't want to receive all the images of an article.
For example:
If I want to get info about all the images of the English language article (http://en.wikipedia.org/wiki/English_language), I should go to this URL:
http://en.wikipedia.org/w/api.php?action=query&titles=English_Language&prop=images
But what I receive, as XML, includes the flags of the countries where people speak English:
<?xml version="1.0"?> <api> <query>
<normalized>
<n from="English_language" to="English language" />
</normalized>
<pages>
<page pageid="8569916" ns="0" title="English language">
<images>
<im ns="6" title="File:Anglospeak(800px)Countries.png" />
<im ns="6" title="File:Anglospeak.svg" />
<im ns="6" title="File:Circle frame.svg" />
<im ns="6" title="File:Commons-logo.svg" />
<im ns="6" title="File:Flag of Argentina.svg" />
<im ns="6" title="File:Flag of Aruba.svg" />
<im ns="6" title="File:Flag of Australia.svg" />
<im ns="6" title="File:Flag of Bolivia.svg" />
<im ns="6" title="File:Flag of Brazil.svg" />
<im ns="6" title="File:Flag of Canada.svg" />
I only want the information about the principal image.
There's news! (from 2014)
A new extension, PageImages, is available and is already installed on the Wikimedia wikis.
Instead of prop=images, use prop=pageimages, and you'll get a pageimage attribute and a <thumbnail> child node for each <page> element.
Admittedly, it's not guaranteed to give the best results, but in your example (English Language) it works well and only yields the map of the geographic distribution, not all the flags.
Also, the OpenSearch API does return an <image> in its XML representation, but this API is not usable with lists and cannot be combined with the Query API.
This is how I got it working...
$.getJSON("http://en.wikipedia.org/w/api.php?action=query&format=json&callback=?", {
titles: "India",
prop: "pageimages",
pithumbsize: 150
},
function(data) {
var source = "";
var imageUrl = GetAttributeValue(data.query.pages);
if (imageUrl == "") {
$("#wiki").append("<div>No image found</div>");
} else {
var img = "<img src=\"" + imageUrl + "\">"
$("#wiki").append(img);
}
}
);
function GetAttributeValue(data) {
var urli = "";
for (var key in data) {
if (data[key].thumbnail != undefined) {
if (data[key].thumbnail.source != undefined) {
urli = data[key].thumbnail.source;
break;
}
}
}
return urli;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<html>
<head></head>
<body>
<div id="wiki"></div>
</body>
</html>
As others have noted, Wikipedia articles don't really have any such thing as a "principal image", so your first problem will be deciding how to choose between the different images used on a given page. Some possible selection criteria might be:
Biggest image in the article.
First image exceeding some specific minimum dimensions, e.g. 60 × 60 pixels.
First image referenced directly in the article's source text, rather than through a template.
For the first two options, you'll want to fetch the rendered HTML code of the page via action=parse and use an HTML parser to find the img tags in the code, like this:
http://en.wikipedia.org/w/api.php?action=parse&page=English_language&prop=text|images
(The reason you can't just get the sizes of the images, as used on the page, directly from the API is that that information isn't actually stored anywhere in the MediaWiki database.)
For the last option, what you want is the source wikitext of the article, available via prop=revisions with rvprop=content:
http://en.wikipedia.org/w/api.php?action=query&titles=English_language&prop=revisions|images&rvprop=content
Note that many images in infoboxes and such are specified as parameters to a template, so just parsing for [[Image:...]] syntax will miss some of them. A better solution is probably to just get the list of all images used on the page via prop=images (which you can do in the same query, as I showed above) and look for their names (with or without Image: / File: prefix) in the wikitext.
Keep in mind the various ways in which MediaWiki automatically normalizes page (and image) names: most notably, underscores are mapped to spaces, consecutive whitespace is collapsed to a single space and the first letter of the name is capitalized. If you decide to go this way, here's some sample PHP code that will convert a list of file names into a regexp that should match any of them in wikitext:
foreach ($names as &$name) {
$name = trim( preg_replace( '/[_\s]+/u', ' ', $name ) );
$name = preg_quote( $name, '/' );
$name = preg_replace( '/^(\\\\?.)/us', '(?i:$1)', $name );
$name = preg_replace( '/\\\\? /u', '[_\s]+', $name );
}
$regexp = '/' . implode( '|', $names ) . '/u';
For example, when given the list:
Anglospeak(800px)Countries.png
Anglospeak.svg
Circle frame.svg
Commons-logo.svg
Flag of Argentina.svg
Flag of Aruba.svg
the generated regexp will be:
/(?i:A)nglospeak\(800px\)Countries\.png|(?i:A)nglospeak\.svg|(?i:C)ircle[_\s]+frame\.svg|(?i:C)ommons\-logo\.svg|(?i:F)lag[_\s]+of[_\s]+Argentina\.svg|(?i:F)lag[_\s]+of[_\s]+Aruba\.svg/u
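If you are doing this client-side, the same idea can be sketched in JavaScript. This is a rough port, not a line-by-line translation: JavaScript regular expressions have not traditionally supported the (?i:...) inline group, so the sketch uses a two-letter character class for the first character instead. The function names, the sample list and the sample wikitext are made up for illustration, and the names are assumed to be non-empty.

```javascript
// Escape characters that are special in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn one file name into a pattern that tolerates MediaWiki's normalization:
// underscores or whitespace runs between words, and either case for the first letter.
function fileNamePattern(name) {
  const clean = name.replace(/[_\s]+/g, ' ').trim();
  const first = clean.charAt(0);
  const upper = first.toUpperCase();
  const lower = first.toLowerCase();
  const firstPattern =
    upper !== lower && upper.length === 1 && lower.length === 1
      ? '[' + upper + lower + ']'
      : escapeRegExp(first);
  return firstPattern + escapeRegExp(clean.slice(1)).replace(/ /g, '[_\\s]+');
}

// Combine all names into one alternation.
function buildFileNameRegExp(names) {
  return new RegExp(names.map(fileNamePattern).join('|'));
}

// Made-up sample data: names as returned by prop=images, plus a wikitext fragment.
const names = ['Anglospeak.svg', 'Circle frame.svg', 'Flag of Aruba.svg'];
const wikitext =
  '{{Infobox language | map = anglospeak.svg }} [[File:Flag_of_Aruba.svg|20px]]';

// The leftmost match is the first listed image that the wikitext mentions.
const match = buildFileNameRegExp(names).exec(wikitext);
const firstUsed = match ? match[0] : null;
```

With this sample input, firstUsed is the infobox map, because it appears in the wikitext before the flag does, even though it is written with a lowercase first letter.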
Important addendum
Bergi's answer, above, seemed super great, but I was bashing my head out because I couldn't get it to work.
I needed to include pilicense=any in my query, because otherwise any copyrighted imagery was ignored.
Here's the query I ultimately got working:
https://en.wikipedia.org/w/api.php?action=query&pilicense=any&format=jsonfm&prop=pageimages&generator=search&gsrsearch=My+incategory:English-language_films+prefix:My&gsrlimit=3
I know it's been awhile, but this is one of the first pages I landed on when I started my days-long search for how to do this, so I wanted to share this specifically on this page, for others like me who might come here.
You can limit your query to the first image in the article with the imlimit parameter:
http://en.wikipedia.org/w/api.php?action=query&titles=English_Language&redirects&prop=images&imlimit=1
I have a form for user input in which users can add images hosted elsewhere. The images are displayed as a small icon, which is both a link and a trigger that loads the image into a div with stacking order +1 on hover. The image source address is stored in the link tag only.
I am using a <div /> with contenteditable=true for the user input. The icon is appended when the form is used. The code for this part works fine. What I would like to do is check the source of all image tags to make sure that users are not adding their own html to display full size images in their post.
I am using php on the backend to remove all tags except links and images, but would like to use jQuery to check the src of the image tags before posting.
<img src="my_icon" /> //this is what my form will input
<img src="anything_else"> //this is what I want to prevent
Update: I apologize if this is not clear. Essentially, I don't want the user to be able to input any html of their own. If they want to add an image, they have to use my built in form which inserts something like above.
You could loop over the images and then check the src attribute.
$("img").each(function(index) {
if ($(this).attr("src") == ...) {
// do something
}
});
See http://api.jquery.com/each/ and http://api.jquery.com/attr/ for more information.
Say we have the <div id="editor" />; the jQuery script would look something like this:
var srcs = [];
jQuery ('div#editor img').each (function () {
srcs.push (jQuery (this).attr ('src'));
});
srcs will now hold all the src-attributes from the <img />-tags provided in the <div id="editor" /> tag.
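Once the src values are collected, checking them is plain array work. A minimal sketch, assuming the only legitimate src is the icon the built-in form inserts; ALLOWED_ICON_SRC, findDisallowedSrcs and the sample values are placeholders, so substitute your real icon path.

```javascript
// Placeholder: the icon src that the built-in form inserts.
const ALLOWED_ICON_SRC = 'my_icon';

// Return every collected src that is not exactly the allowed icon.
function findDisallowedSrcs(srcs, allowedSrc) {
  return srcs.filter(function (src) {
    return src !== allowedSrc;
  });
}

// Made-up sample of what the loop above might have collected.
const collected = ['my_icon', 'http://example.com/big.jpg', 'my_icon'];
const offenders = findDisallowedSrcs(collected, ALLOWED_ICON_SRC);
// If offenders is non-empty, refuse to post (or strip those images) before submitting.
```

A check like this only helps honest users, since anyone can bypass client-side script, so the PHP filtering on the backend still has to stay in place.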
For your site specifically, the following code alerts the links of all the images:
$('.postimage').each(function(){
alert($(this).attr('href'));
});
I have an answer. When users input a < or > in a contenteditable="true" div, the browser replaces them with the HTML entities &lt; and &gt;. The good news is that the images would never have displayed when outputting the user comment. The bad news is that the jQuery-based solutions given above will not work to remove the ugly markup. I ended up using PHP to do it:
$post = $_POST['comment'];
$imgcheck = true;
$stringstart = 0;
// Scan for entity-encoded <img ...> tags and drop any whose src is not the site icon
while ($imgcheck) {
    $stringstart = strpos($post, '&lt;img', $stringstart);
    if ($stringstart !== false) {
        $stringend = strpos($post, '&gt;', $stringstart);
        if ($stringend !== false) {
            // +4 covers the length of the closing "&gt;"
            $strlength = $stringend - $stringstart + 4;
            $substring = substr($post, $stringstart, $strlength);
            if (!preg_match('~src="\/images\/ImageLink.jpg"~', $substring)) {
                $post = str_replace($substring, "", $post);
            } else {
                $stringstart = $stringend;
            }
        } else {
            $imgcheck = false;
        }
    } else {
        $imgcheck = false;
    }
}
I've stored my Images into (Medium) BLOB fields and want to retrieve them embedded within my PHP-generated web pages.
When I test retrieving the stored images using
header('Content-type: ' . $image['mime_type']);
echo $image['file_data'];
everything looks just fine.
However, I have not yet found a way to retrieve the image(s) cleanly into the middle of my documents. For example, using
$image = $row['file_data'];
echo '<img src="data:image/jpeg;base64,'.$image['file_data'].'" alt="photo"><br>';
...or...
$im = imageCreateFromString($image);
I just wind up with a bunch of hexadecimal garbage on screen.
I initially stored the images using:
ob_start();
imagejpeg($resizedImage, null, 100);
$content = ob_get_contents();
ob_end_clean();
$sql = sprintf(
"insert into images (filename, mime_type, file_size, file_data, event_id)
values ('%s', '%s', %d, '%s',%d)",
mysql_real_escape_string($fileName),
mysql_real_escape_string($mimeType),
$imageSize,
mysql_real_escape_string($content),
$eventID
);
$result = $cn->query($sql);
Does anyone PLEASE have a working code snippet to successfully display the stored .jpg mid-file in the PHP output?
echo '<img src="data:image/jpeg;base64,'.base64_encode($image['file_data']).'" alt="photo"><br>';
However, remember that old IE versions do not support this kind of inline images! Besides that, the browser cannot cache such an image except together with its containing HTML page.
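The underlying point is language-independent: a data: URI needs the raw bytes base64-encoded, otherwise the browser sees binary garbage. Here is a tiny sketch of the same step in JavaScript (Node.js, using its built-in Buffer) rather than PHP; toDataUri is a made-up helper name and the three bytes are just the start of a JPEG file, not a complete image.

```javascript
// Build a data: URI from raw bytes by base64-encoding them first.
function toDataUri(bytes, mimeType) {
  return 'data:' + mimeType + ';base64,' + Buffer.from(bytes).toString('base64');
}

// The first three bytes of a JPEG file, used here as stand-in data.
const jpegStart = Uint8Array.from([0xff, 0xd8, 0xff]);
const uri = toDataUri(jpegStart, 'image/jpeg');
```

Base64 output for JPEG data starts with "/9j/", which is a quick way to check that the value in your src attribute has actually been encoded.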
You should create some sort of "image server". You're already close to that.
For example, create something like image.php that takes an image name and outputs that image on the fly.
So, for example, say you want to get somePic.jpg image. You can get it through:
image.php?name=somePic.jpg
<?php
header('Content-type: ' . $image['mime_type']);
echo $image['file_data'];
?>
Your tag:
<img src='image.php?name=somePic.jpg' />
Or more general:
echo "<img src='image.php?name={$image['filename']}' />";
Why not just call your test page image.php, then have it called from the browser on the rendered page:
<img src="image.php?imageid=123" alt="photo" />
I am trying to get the post link from an RSS feed. I load all the posts into an array correctly (I can successfully echo the content and other tags), but I have a problem getting the link.
In the feed, the link can be found in two ways:
1.
<link rel="alternate" type="text/html" href="this is the address I want" title="here goes the title" />
I tried <?php echo $post->link[href]; ?>, but because there are several link tags in an entry, it must echo the one that has rel="alternate".
2.
<feedburner:origLink>this is the address</feedburner:origLink>
I tried <?php echo $post->feedburner:origLink; ?>
My question is how to get the link ? I prefer the 2nd way because it does not go through the feedburner link.
Note: I have two RSS XML structures in the array, so what I will use is something like this,
(($post->description)?$post->description:$post->content) as I do for the description/content
1. rel=alternate
$links = $post->xpath('link[@rel="alternate" and @type="text/html"]');
$link = (string) $links[0]['href'];
See http://php.net/simplexmlelement.xpath and http://php.net/simplexml.examples-basic (Example #5)
2. feedburner:origLink
$links = $post->xpath('feedburner:origLink');
$link = (string) $links[0];
// or
$link = (string) $post->children('feedburner', TRUE)->origLink;
See http://php.net/simplexmlelement.children
I had the same problem, but I solved it with the following:
$link = $xml->entry[$i]->link[2]->attributes()->href;
//the blog feed has 3 types of links
where $xml is probably $post in your case.