http://paultan.org/feed/
In the feed above I can get an image URL like http://s2.paultan.org/image/2014/12/Theo-top-10-fave-renders-108x108.jpg
but on feedly.com I'm seeing a different image link, one with a bigger size. I wonder how feedly can retrieve that image, since it's not in the feed's DOM.
Would this link represent the large image that you see?
http://s1.paultan.org/image/2014/12/top-10-posts.jpg
feedly.com may be reading the destination page once and saving the result in their database. In the page's header you have a Facebook (Open Graph) entry as follows:
<meta content="http://s1.paultan.org/image/2014/12/top-10-posts.jpg"
property="og:image">
which feedly.com would be using to present the image. This, of course, means a lot more bandwidth is used in order to display such information. They may also limit this to websites that have such headers: they test once or twice, and if the website does not offer a link like the og:image, they stop testing that website altogether, at least for a while.
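As a rough illustration of reading that og:image entry, here is a minimal sketch. It assumes the page's HTML has already been downloaded into a string; the function name extractOgImage is hypothetical, and nothing here is known to be how feedly actually does it.

```php
<?php
// Hypothetical helper: return the og:image URL found in a page's HTML, or null.
function extractOgImage(string $html): ?string
{
    if (trim($html) === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Look for <meta property="og:image" content="..."> anywhere in the page.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//meta[@property="og:image"]/@content');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $url = trim($nodes->item(0)->nodeValue);
    return $url === '' ? null : $url;
}
```

A caller would fetch the article page once, pass the HTML to this function, and store the returned URL next to the feed item so the page does not have to be fetched again.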
My MODX site needs to grab the first image from all pages for Open Graph Meta tags, which will be plugged into the Head chunk for all templates.
The problem with this is that not all images are located in the content part of a page. Some are located inside Chunks and others inside TVs. (Finding an image tag from the content is not an issue.)
It might be possible to get all Chunks and TVs and loop through their values to check for images.
But is there a way to get the <body> contents of the resource?
There are probably several ways. You can try writing a plugin to parse through the entire content of a page; it looks like the OnWebPageComplete event may be the one to use (take a look at the different events to see if one is more appropriate).
You can try to grab the resource from the cache, keeping in mind that any chunks/snippets/TVs called un-cached in the page will not show up in the resource cache file.
You can get a list of TVs once you have loaded a resource & then use getTVValue to get the value.
If you have an image in a chunk, getChunk might work [might, I've never tried to use it that way] to get its contents, but I would imagine that the image in a chunk would come from a TV, so you should be able to retrieve it with getTVValue.
You could also just set up a TV for the Open Graph image and explicitly set it on a page-by-page basis.
Probably writing a plugin & some regex is going to be the least painful way of going about it.
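Whichever way the rendered page is obtained, the "first image" step could look roughly like the sketch below. It uses PHP's DOM parser rather than a regex, and it assumes the full HTML output is already available as a string; the function name firstImageSrc is hypothetical, and wiring it into a MODX plugin event is not shown.

```php
<?php
// Hypothetical helper: return the src of the first <img> in rendered HTML, or null.
function firstImageSrc(string $html): ?string
{
    if (trim($html) === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath returns nodes in document order, so item(0) is the first image,
    // no matter whether it came from content, a chunk or a TV.
    $xpath = new DOMXPath($doc);
    $images = $xpath->query('//img[@src]');
    if ($images === false || $images->length === 0) {
        return null;
    }

    $src = trim($images->item(0)->getAttribute('src'));
    return $src === '' ? null : $src;
}
```

Because it works on the final output, it does not matter where in the template the image was defined.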
Is it possible to put PHP code into raw images?
For example:
http://gifsec.com/wp-content/uploads/GIF/2014/05/GIF-When-white-guys-dance.gif
If you go to that URL you'll just see the raw image on a white page. Is it possible to somehow put code into this raw page? For example, you may want to put Google Analytics tracking into raw image files so you can track people on reddit sharing raw files.
Not that I know of. What you may want to consider is having people share a link that redirects to a page from which they can download the file, and putting code that tracks or counts visitors into that page. Tracking the individual visitor is harder and leads into ethical issues, so I would just set up Google Analytics and put their code into that page.
No.
http://gifsec.com/wp-content/uploads/GIF/2014/05/GIF-When-white-guys-dance.gif
is a resource on your server. That URL simply directs the browser to where the image is stored on the server.
To achieve what you want, simply create a page such as
http://gifsec.com/GIF-When-white-guys-dance
and include the image in it with <img src=''>. On this page you can then add your Google Analytics code.
Images are sent from server to browser as binary image data, which the browser only decodes and displays. This is why it will not work the way you are thinking.
You can hide anything you want in an image file. This is called steganography. The problem is that the code won't be executed unless it's uploaded to a server that is specifically set up to extract and run it.
It's not silly, just difficult. What you would have to do is use a PHP script to process it back. As such, your dance.gif would become dance.php and you would link to that. It will add some overhead to your server to do this so just be aware, however, this would allow you to track it via PHP. You could then import that data into Google Analytics at a later date.
Here's some pseudo code (we'll call this dance.php)
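<?php
// Insert some tracking here, like a database INSERT statement

// Note: GD loads only the first frame of an animated GIF;
// readfile('/path/to/dance.gif') would send the original bytes unchanged.
$image = imagecreatefromgif('/path/to/dance.gif');
header('Content-Type: image/gif');
imagegif($image);
imagedestroy($image);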
Then in your HTML
<img src="dance.php">
What you need is called pixel tracking, also known as a web bug.
Take a look at this answer:
https://stackoverflow.com/a/13079838/797495
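For illustration only, a tracking pixel can be sketched as a PHP script that records the request and then answers with a tiny image. The log format, the function names and the log file path below are assumptions made for this example, not something taken from the linked answer.

```php
<?php
// A 1x1 transparent GIF, base64-encoded, used as the response body.
const PIXEL_GIF_BASE64 = 'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7';

// Hypothetical log format: ISO timestamp, client IP and referer, tab-separated.
function buildHitLine(array $server, int $timestamp): string
{
    $ip = $server['REMOTE_ADDR'] ?? '-';
    $referer = $server['HTTP_REFERER'] ?? '-';
    // Strip tabs and newlines so one request always stays on one log line.
    $referer = preg_replace('/[\t\r\n]+/', ' ', $referer);
    return gmdate('c', $timestamp) . "\t" . $ip . "\t" . $referer;
}

// Record the hit, then send the pixel. Call this from the script the <img> points to.
function servePixel(array $server, string $logFile): void
{
    file_put_contents($logFile, buildHitLine($server, time()) . PHP_EOL, FILE_APPEND | LOCK_EX);
    header('Content-Type: image/gif');
    header('Cache-Control: no-store'); // ask browsers not to cache, so repeat views are logged
    echo base64_decode(PIXEL_GIF_BASE64);
}
```

A script such as pixel.php would then contain only servePixel($_SERVER, '/path/to/hits.log'); and be referenced with an <img> tag like any other image.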
I am building a "Reddit" like site.
The user can post a URL, from which I want to get the correct image with PHP.
What I need is a script like the ones sites such as Facebook or Tumblr use to fetch the images.
I have already seen scripts which get the images by fetching the HTML content and searching for "img" tags.
Are there any better methods/scripts available?
Maybe even scripts which will order the images by the size: The bigger the image the more important it is.
Thanks for answers
You may want to check out PHPQuery; it will allow you to easily iterate through all images on a given website. You can then work out the area of each image and sort them accordingly.
It depends a bit on what you're looking for and which image the user would like to have with their post. To give you an example: I once wrote a method that searches for a company's logo on the company's website. To do so, I searched for, indeed, the img tags using simple_html_dom and filtered those tags on the presence of "logo" in the alt attribute. The results are displayed to the user to select the right image, since you may find multiple images fitting your purpose.
I would indeed, as you proposed, have a look at the size and skip small images (e.g. smaller than, let's say, 50 px).
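The size-based filtering and ordering could be sketched like this. It assumes the image dimensions are already known (for example from getimagesize(), which is not called here), and the input array shape and the function name rankImagesBySize are made up for the example.

```php
<?php
// Hypothetical input shape: each candidate is
// ['src' => string, 'width' => int, 'height' => int].
// Returns the candidates that are at least $minSide px on both sides,
// largest area first.
function rankImagesBySize(array $candidates, int $minSide = 50): array
{
    // Skip icons, spacers and other tiny images.
    $kept = array_filter(
        $candidates,
        fn(array $img): bool => $img['width'] >= $minSide && $img['height'] >= $minSide
    );

    // The bigger the area, the more important the image is assumed to be.
    usort(
        $kept,
        fn(array $a, array $b): int => ($b['width'] * $b['height']) <=> ($a['width'] * $a['height'])
    );

    return $kept;
}
```

The first element of the result is the suggested image; showing the top few to the user, as described above, covers the cases where the largest image is not the right one.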
I am looking for some help in downloading pics from a website. Here is the problem detail.
URL is basvandenbroek dot com,
suppose when we visit the following page http://www.basvandenbroek.com/nl/product/27341/704/snaarinstrumenten/boston/snarenset_elektrisch.html
there is a thumbnail pic which, when clicked, brings up its larger version. I would like to capture the larger image using a PHP script and download it onto my PC.
Problem is when we inspect the HTML we see the following code for images
../../../../../../../jpg/27000/27341.jpg
../../../../../../../jpg/cache/27000/220_220_27341.jpg
Based on the above code I assumed that if I prepend the website address to
jpg/27000/27341.jpg I could access the pic, but it's not working like that.
I believe the URL is hidden, or I might not be understanding things properly. I am new to PHP and scripting and I would like somebody to help me through this situation.
Thank you
For the website you mentioned, if the thumbnail is
http://www.basvandenbroek.com/jpg/cache/27000/220_220_27341.jpg
then the original is
http://www.basvandenbroek.com/jpg/27000/27341.jpg
So the thumbnail is basically the original's file name with the dimensions (220 x 220) added as a prefix, stored in a different folder. Also, there is no such thing as a hidden URL. Any link that is valid on a web page is sure to appear in the HTML source. In Chrome and Firefox, at least, you can find this link by right-clicking the link and copying the link address.
In your case you can find the thumbnail's url by right-clicking the thumbnail and the original's url by right-clicking it.
However, if you want to do this automatically using PHP, you will have to write code that can parse the html for the page to determine the urls.
In your example, here would be the larger image:
http://www.basvandenbroek.com/jpg/27000/27341.jpg
The smaller image is at:
http://www.basvandenbroek.com/jpg/cache/27000/220_220_27341.jpg
This means you would need to scrape out the first two underscored parts of the name (220_220) using string manipulation. You would also want to string replace "cache/" with an empty string.
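That string manipulation might be sketched as follows. It assumes every thumbnail on the site follows the same cache/<width>_<height>_<name> pattern as the single example above, which has not been verified; the function name is hypothetical.

```php
<?php
// Hypothetical helper: turn a thumbnail URL into the full-size image URL.
function fullSizeUrlFromThumbnail(string $thumbUrl): string
{
    // Drop the "cache/" path segment...
    $url = str_replace('/cache/', '/', $thumbUrl);
    // ...then drop the leading "<width>_<height>_" prefix from the file name.
    return preg_replace('~/\d+_\d+_([^/]+)$~', '/$1', $url);
}
```

A URL that does not match the pattern is returned unchanged apart from the "cache/" removal, so the result is worth checking before downloading.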
Relative URLs are relative to the URL of the containing document. So if the document you're scraping is located at http://example.com/foo/bar/baz/doc.html, and the image is referenced as
../../omg/wtf/lol/cat.jpeg, its full URL is http://example.com/foo/bar/baz/../../omg/wtf/lol/cat.jpeg, or http://example.com/foo/omg/wtf/lol/cat.jpeg.
By the way, this has nothing to do with PHP or scripting in general; it is firmly a URL/HTTP thing. And there are no "hidden" URLs in HTTP; that would be a contradiction.
Edit: your comment makes it look like the problem is with the Referer header or session ID sent (or not sent) in your request.
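The relative URL resolution described in this answer can be sketched in PHP as below. This is a simplified illustration rather than a complete resolver: it ignores query strings, fragments, protocol-relative URLs and trailing slashes, and the function name resolveRelativeUrl is hypothetical.

```php
<?php
// Hypothetical helper: resolve $relative against the URL of the document
// that contains it. Simplified: no query strings, fragments or "//host" forms.
function resolveRelativeUrl(string $baseUrl, string $relative): string
{
    // Already absolute: nothing to resolve.
    if (preg_match('~^[a-z][a-z0-9+.-]*://~i', $relative)) {
        return $relative;
    }

    $base = parse_url($baseUrl);
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');
    $basePath = $base['path'] ?? '/';

    if (strncmp($relative, '/', 1) === 0) {
        // Root-relative: replaces the whole path.
        $path = $relative;
    } else {
        // Path-relative: replaces everything after the last "/" of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $relative;
    }

    // Remove "." and ".." segments; ".." above the root is simply ignored.
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($segments);
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $segments);
}
```

Applied to the product page from the question and the reference ../../../../../../../jpg/27000/27341.jpg, this yields http://www.basvandenbroek.com/jpg/27000/27341.jpg, the same address given in the other answers.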
I wonder if anyone can point me in the right direction.
I have a rather large spreadsheet of product info that needs plugging into a shop. The tricky bit is that the spreadsheet has a link which points to the relevant page on another site which has the product's details, and what I need to do is grab the relevant image and save it locally so I can use it later. The reason why I'm thinking along these lines is that there are 7500 products...
My friend suggested I could maybe use php & filepopen.
The image does have an outer tag ID which I can refer to.
I was thinking of iterating through the spreadsheet this is the type of link I have to work with
http://www.apc.com/resource/include/techspec_index.cfm?base_sku=APCRBC105
the images themselves are called something random, but I figured I could rename them as I grab them to the more relevant SKU number.
So: iterate through the spreadsheet by SKU number,
identify the image by the relevant id on the page (I'm assuming it's in the same place on every page),
save the image while renaming it to the correct SKU number.
Any ideas on how I could go about this? The thought of visiting each page manually and saving the image 7500 times doesn't seem the best way forward!
Thanks for looking
Rip the base_sku from your links.
APCRBC105
Then use curl to fetch the image page
http://www.apc.com/products/moreimages.cfm?partnum=APCRBC105
Rip the image link with a regular expression on:
<div align="center">
<img align="center" src="http://www.apcmedia.com/resource/images/500/Front_Left/35531838-5056-9170-D33F24AE47742E6C_pr.jpg" />
</div>
Then use curl again to rip the actual image and save it.
That should work..
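The pieces of that approach could be sketched like this. The helper names are hypothetical, the curl options are just reasonable defaults, and the assumption that every spreadsheet link carries a base_sku parameter comes only from the single example link above.

```php
<?php
// Pull the base_sku query parameter out of a spreadsheet link, or null if absent.
function skuFromUrl(string $url): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }
    parse_str($query, $params);
    $sku = $params['base_sku'] ?? null;
    return is_string($sku) && $sku !== '' ? $sku : null;
}

// Name the local file after the SKU, keeping the remote file's extension.
function localImageName(string $sku, string $imageUrl): string
{
    $ext = pathinfo((string) parse_url($imageUrl, PHP_URL_PATH), PATHINFO_EXTENSION);
    // Keep only characters that are safe in a file name.
    $safeSku = preg_replace('/[^A-Za-z0-9_-]/', '_', $sku);
    return $safeSku . '.' . ($ext !== '' ? strtolower($ext) : 'jpg');
}

// Fetch a URL with curl; returns the response body, or null on failure.
function fetchUrl(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}
```

For each spreadsheet row, the loop would then take the SKU with skuFromUrl(), fetch the image page with fetchUrl(), rip the image link out of the returned HTML, fetch that link, and write the bytes with file_put_contents() under the name from localImageName(). With 7500 products, a short pause between requests is a sensible courtesy to the remote server.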
If there aren't any issues regarding copyrighted material, take a look at Google Refine.
You can grab content from websites based on your cell values and use them afterwards to build more complex scenarios.
See the screencasts for more info (screencast 3 talks about fetching values via URLs).
Once you have the image URLs in your spreadsheet, it should be fairly easy to fetch them via curl or similar.