Using PHP I'm trying to download/save the following image:
http://www.bobshop.nl/catalog/product_image.php?size=detail&id=42428
When you load this image in a browser you can see it, but when I try to download it using several different methods, I get a 1 KB file that says the product could not be found on the server.
I tried this with both the file_put_contents and the curl way.
I even used the get_web_page function that I found somewhere on Stack Overflow to catch a possible redirect.
What else could be the reason that you can see the image in a browser, but there is no way to download it?
UPDATE:
Thanks to an error that was thrown while trying out the different answers, I just found the real cause of the problem. Somewhere in the process of scraping the HTML, the URL ended up with &amp; instead of &. I replace these now, and every other method works too... thanks all!
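For anyone hitting the same issue, this is roughly how that fix might look. A minimal sketch only; the scraping code itself isn't shown in the question, so the variable names here are made up:
<?php
// Hypothetical scraped URL still containing the HTML entity &amp;
$scrapedUrl = 'http://www.bobshop.nl/catalog/product_image.php?size=detail&amp;id=42428';

// html_entity_decode() turns &amp; back into &, which the image script expects.
$url = html_entity_decode($scrapedUrl);

// Any of the usual download methods should now work.
file_put_contents('/tmp/image.jpg', file_get_contents($url));
?>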
I just implemented a simple way to download and store the image, and it worked:
<?php
// Read the remote image into a single string (file() returns an array of lines).
$fileContent = implode("", file("http://www.bobshop.nl/catalog/product_image.php?size=detail&id=42428"));

// Write it out to a local file.
$fp = fopen("/tmp/image", "w+");
fwrite($fp, $fileContent);
fclose($fp);
?>
Are you behind a proxy? That could be the problem (your browser has the proxy configured, but PHP does not). ;)
There is likely some kind of header checking being done by this PHP script to ensure that a browser is requesting the image and not someone trying to scrape their content. This can be forged (although after doing something like this I feel like I need to take a shower) with cURL, specifically curl_setopt():
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'User-Agent: Some legitimate string'
));
To find out which headers need to be sent, you'll need to do some experimentation. If you have Google Chrome, you've probably used the Inspector (if you don't, Firefox has similar add-ons, so you can use something like Firebug). If you request the image with Chrome, you can right-click and inspect it. Go to the Network tab, then refresh the page. The request to product_image.php should show up. If you click on it and then click the Headers tab, you should see the list of headers sent. My browser sends: User-Agent, Accept, Accept-Encoding, Accept-Language, and Accept-Charset.
Try combinations of these headers with valid values to see which ones need to be sent for the image to be returned. I'd bet that this site probably only checks User-Agent, so start with that one.
An important note: you should cache the result of this call, because it will look very suspicious if your server requests the image multiple times in rapid succession (say, if many users on your site hit the script that grabs this image). Also, as an extra layer of anonymity, you might want to pick your User-Agent from an array of valid ones, so bobshop.nl thinks the requests are coming from users behind a large network (like a college campus). You can find valid user-agent strings on UserAgentString.com.
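Putting those pieces together, a minimal sketch might look like the following. The header values, cache path, and cache lifetime are assumptions for illustration, not values the site is known to require:
<?php
// Sketch: fetch the image with browser-like headers and cache it locally
// so the remote site isn't hit on every page view.
$url = 'http://www.bobshop.nl/catalog/product_image.php?size=detail&id=42428';
$cacheFile = '/tmp/product_42428.jpg';

// Only re-fetch if the cached copy is missing or older than an hour.
if (!file_exists($cacheFile) || filemtime($cacheFile) < time() - 3600) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0',
        'Accept: image/png,image/*;q=0.8,*/*;q=0.5',
    ));
    $data = curl_exec($ch);
    if ($data !== false && curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200) {
        file_put_contents($cacheFile, $data);
    }
    curl_close($ch);
}
?>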
Related
Using PHP, I'm trying to download file from this link:
http://creator.zoho.com/DownloadFile.do?filepath=/1472044441814_Lighthouse.jpg&sharedBy=29184456.
I've tried everything, like copy(), file_put_contents("img.jpeg", file_get_contents($url)), and cURL, but none of them work.
What happens is that an image file is created on my server, but when I view it, it contains all the HTML and CSS and so on; when I open it in the Windows image previewer, it says it can't preview the picture.
Can someone please tell me what I'm doing wrong here? Thank you.
The site probably intentionally tries to make it as hard as possible.
Actually, there are 2 main ways to check this:
Checking the session ids, and allowing the image download only from logged in sessions of users allowed to see that picture,
Checking http referer.
The second is much more common.
Improve your HTTP request so that it contains a valid, logged-in session ID and the Referer a real browser would provide. You can find these values by checking the cookies and HTTP request parameters of a regular browser, which you can do very easily, for example, with the Firebug extension for Firefox.
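A rough sketch of what that might look like with cURL, assuming the Referer check is the one in play. The cookie name, cookie value, and referring page below are placeholders that you would copy from your own browser session:
<?php
// Sketch: send a browser-like Referer and session cookie with the request.
// The values below are placeholders, not working credentials.
$url = 'http://creator.zoho.com/DownloadFile.do?filepath=/1472044441814_Lighthouse.jpg&sharedBy=29184456';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://creator.zoho.com/the-page-that-links-to-the-file');
curl_setopt($ch, CURLOPT_COOKIE, 'JSESSIONID=value-copied-from-your-browser');
$image = curl_exec($ch);
curl_close($ch);

// Only save the result if we actually got something back.
if ($image !== false) {
    file_put_contents('img.jpeg', $image);
}
?>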
Let's say that I have a website:
www.anywhere.com/test.php
$source = $_SERVER['HTTP_REFERER'];
There's a website that links to mine.
www.nowhere.com/remote1.html
<a href="http://www.anywhere.com/test.php">Link</a>
And another website that does the same thing:
www.somewhere.com/remote2.html
<a href="http://www.anywhere.com/test.php">Link</a>
If I click Link on www.somewhere.com/remote2.html, then in www.anywhere.com/test.php, $source will be www.somewhere.com/remote2.html.
I was wondering whether there is any way that somewhere.com/remote2.html could disguise itself so that, when its link is clicked, $source in anywhere.com/test.php would be another URL (for example www.nowhere.com/remote1.html).
If I were to have a web service where I accept requests from accredited websites, I would need to be assured that nobody is able to access my web service by faking their URI to match an accredited website's.
Thanks
Any request header can be faked. These values are sent by the client and can easily be spoofed, so you should not rely on them; there's no guarantee that they are accurate.
In PHP, you can use cURL to spoof it:
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Referer: http://some-accredited-website.com/',
));
There are even browser extensions/plugins that do this. For example, there's the Modify Headers add-on for Firefox. There are also tools like Fiddler that let you alter these values.
Bottom line: never rely on $_SERVER['HTTP_REFERER'] being accurate.
The Referer is just a field in the HTTP header, so you can either create an HTTP request manually, or use an application that lets you set any field to any value, which is easier because you don't need to calculate lengths manually.
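To illustrate the "create a request manually" option, here is a minimal sketch using fsockopen with the example hosts from the question. The Referer line is just another line of text the sender chooses:
<?php
// Sketch: hand-built HTTP request with a spoofed Referer header.
// Host and paths are the placeholder names used in the question.
$fp = fsockopen('www.anywhere.com', 80, $errno, $errstr, 10);
if ($fp) {
    $request  = "GET /test.php HTTP/1.1\r\n";
    $request .= "Host: www.anywhere.com\r\n";
    $request .= "Referer: http://www.nowhere.com/remote1.html\r\n";
    $request .= "Connection: close\r\n\r\n";
    fwrite($fp, $request);

    // Read and print the full response.
    while (!feof($fp)) {
        echo fgets($fp, 1024);
    }
    fclose($fp);
}
?>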
I have a video broadcasting project in which I need to provide a download option. I have used the Justin.tv API; they send a URL to download the video file, but when I hit that URL I get a 403 Forbidden error. I discussed this problem with their support contact, and he replied:
Browsers will get the 403 error, you need to either proxy the file
through your server (by removing the User-Agent header) or tell users
to use a download manager.
Definitely the latter is not a good idea. Now I am stuck on sending a request without the User-Agent header; how can I do this (using PHP)? I have googled it but did not find anything helpful.
Necromancing this old thread: I don't know if the info in the comment by @ayman-safadi was accurate at the time it was posted (it was a quote from some other location). But now, to remove the user agent header on the curl command line you do this:
-H "User-Agent:"
Maybe you can have the "download" link point to an internal page that will make a cURL call to the actual Justin.tv link.
According to one of the comments:
FYI... unless you specifically set the user agent, no user agent will be sent in your request as there is no default value like some of the other options.
There are a lot more comments that might help.
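Building on the suggestion above about pointing the "download" link at an internal page, a rough proxy sketch could look like this. The remote URL, content type, and filename are placeholders, and a real implementation would want to stream the file rather than buffer it all in memory:
<?php
// download.php - rough sketch of a proxy page for the Justin.tv file.
// The URL below is a placeholder; it would come from their API.
$remoteUrl = 'http://store.justin.tv/archives/some-video.flv';

$ch = curl_init($remoteUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('User-Agent:')); // send no User-Agent header
$data = curl_exec($ch);
curl_close($ch);

// Hand the file to the visitor as a download.
header('Content-Type: video/x-flv');
header('Content-Disposition: attachment; filename="video.flv"');
echo $data;
?>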
I have a site that uses frames. Is it still possible for someone to craft POST data for one of the frames from the browser using the address bar? Two of the frames are static, and the other frame has PHP pages that communicate using POST. It doesn't appear to be possible, but I wanted to be sure.
No, it is not possible to POST data from the address bar. You can only initiate GET requests from there by adding parameters to the URL; the POST body cannot be attached this way.
Regardless of this, it is very much possible to send POST requests to your webserver for the pages in a frame. HTTP is just the protocol with which your browser and webserver talk to each other. HTTP knows nothing about frames or HTML. The page in the frame has a URI, just like any other page. When you click a link, your browser asks the server if it has something for that URI. The server will check if it has something for that URI and respond accordingly. It does not know what it will return though.
With tools like Tamper Data for Firefox or Fiddler for IE, anyone can easily tinker with the HTTP requests sent to your server.
Any data in the $_REQUEST array should be considered equally armed and dangerous regardless of the source and/or environment. This includes $_GET, $_POST, and $_COOKIE.
POST data cannot be added in the address bar.
You should always check and sanitize all data you receive in your PHP code, because anyone can post data to any of your pages.
Don't trust data from outside of your page. Clean it and check it.
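As an illustration of "clean it and check it", here is a minimal sketch. The field name is made up; the point is that validation happens regardless of whether the value arrived via GET, POST, or anything else:
<?php
// Sketch: validate an incoming value instead of trusting it.
// 'id' is a hypothetical field name for this example.
$id = filter_input(INPUT_POST, 'id', FILTER_VALIDATE_INT);
if ($id === false || $id === null) {
    http_response_code(400);
    exit('Invalid request');
}

// Escape on output as well; never echo raw request data.
echo htmlspecialchars((string) $id, ENT_QUOTES, 'UTF-8');
?>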
Maybe not from the browser, but they can still intercept the request, tinker with it, and forward it to the intended destination with a tool like Burp Proxy.
To answer your question: no, it is not possible to send POST data using the address bar.
BUT it is possible to send POST data to any URL in a snap, for example using cURL or a Firefox extension. So be sure to verify and sanitize all the data you receive, no matter whether it came via POST, GET, or anything else.
This is not iframe- or PHP-specific; it should be considered in every web application. Never rely on data sent by anyone being correct, valid, or secure, especially when sent by users.
Yes, they absolutely can, with tools like Firebug, and apparently more specialized tools like the ones listed by Gordon. Additionally, even if they couldn't do it in the browser from your site, they could always create their own form, or submit the POST data through scripting or command-line tools.
You absolutely cannot rely on the client for security.
On a website, I enter some parameters in a form, click on search and then get a page with a message "retrieving your results". After the search is complete, I get another page with my results displayed.
I am trying to recreate this programmatically, and I used Live HTTP Headers to get a peek at what is going on behind the scenes, i.e. the URL, form variables, etc. However, I'm only getting information for what happens up to the page which shows "retrieving your results". Live HTTP Headers is not giving me any information for the page which contains the final results.
What can I do to get this final bit of information (i.e. the URL, form variables, etc.)?
I use Charles HTTP Proxy for all my HTTP troubleshooting needs. It has a ton of options and works with any browser.
"Web Developer" does this:
https://addons.mozilla.org/en-US/firefox/addon/60
@Mark Harrison
I have Web Developer installed. Initially, I used it to turn off meta-redirects and referrers to get a clearer picture of the HTTP interaction, but when I do this the website does not work (i.e. it is not able to complete the process of retrieving my search results), so I turned it back on.
I'm wondering if anyone has had to capture HTTP information for a site that has a processing page in between the user input page and the results page.
That sounds weird; I'm pretty sure that Live HTTP Headers should show this. Can you double-check that you aren't missing something? Otherwise, try Firebug. It has a network tab which shows all requests made.
I'm using Fiddler2, which is a free (as in beer), highly configurable proxy; it works with all browsers and allows header inspection/editing/auto-modification on request/response.
Disclaimer: I'm in no way affiliated with Fiddler, just a (very happy) user.
For problems like this, I always fire up Ethereal or a similar network sniffing tool to see exactly what is going on.
The page creates a browser object called XMLHttpRequest. On the submit event, the object's send() method is called; while waiting for the server response, an HTML element is replaced with a "waiting" message, and on a successful response a callback is called with the new HTML, which is then inserted into the selected element. (That's called Ajax.)
If you want to follow that process, you can use the Firefox Live HTTP Headers extension, or Wireshark, to view the full HTTP headers and actions (GET/POST).