PHP + cURL gets file that shouldn't exist - php

This is my first question here, and I wouldn't ask if I hadn't already tried nearly everything in my power. My problem is as follows:
I was tasked with maintaining an old web app in our company, and part of that was changing an old date in an HTML file to the current year. I did that, and when I access the file directly with the required parameters, it works. However, the app itself doesn't access this file directly; it loads it through something called "proxy.php", which defines allowed hosts and some other things, but mostly just fetches data via cURL. If I access the file I changed through "proxy.php", it returns a file that shouldn't exist anywhere on the server (with the old content in it).
I replicated proxy.php's logic below, with the same result (the old file is served):
<?php
//phpinfo();
// proxy.php test because weird things are happening...
$url = $_GET["url"]; // the URL it receives is already escaped
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
$xml = curl_exec($ch);
echo $xml;
?>
A colleague also told me that they have always only changed the date in the HTML file, and it worked until now. I hope I've provided enough information about my problem. Thank you in advance.

Just wanted to give a quick update, as I have thankfully resolved the issue.
Deep in the configuration was an old proxy server that seemed to be serving cached files.
I changed "proxy.php" to use a new proxy server:
$proxy = "your.proxy.server:port";
curl_setopt($ch, CURLOPT_PROXY, $proxy);
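If swapping the proxy server had not been possible, asking the old one to revalidate instead of serving its cache might also have helped. A rough sketch, assuming the intermediary honours standard cache request headers:
// Ask any intermediate proxy/cache to revalidate instead of serving a stale copy.
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Cache-Control: no-cache',
    'Pragma: no-cache'
));
// A cruder fallback is a cache-busting query parameter:
// $url .= (strpos($url, '?') === false ? '?' : '&') . '_ts=' . time();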
Thanks everyone for your advice :)

Related

What's the correct way to use PHP cURL for a GET request with query parameters and an alternate port number? (for Solr)

I'm trying to use PHP cURL to send a GET request to Apache Solr to receive search results, and I'm running into some trouble. I think this has more to do with PHP cURL than with Solr... but I digress. Here's what I have so far.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://example.com/solr/example/select?" . $this->query);
curl_setopt($ch, CURLOPT_PORT, 8983);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_TIMEOUT, 4); // seconds; an integer, not a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$data = json_decode(curl_exec($ch), TRUE);
curl_close($ch);
Currently, the request times out at 4 seconds, so I'm wondering if maybe the port number isn't being set properly. I also tried including it in the URL itself, with no luck. The weird part is that I can echo the constructed URL, add the port number manually, copy/paste it into the browser, and it works! But for some reason it doesn't with the code above.
Any help would be greatly appreciated! Thanks in advance!
The above code actually works as intended. My issue turned out to be local firewall rules blocking outbound connections to certain port numbers (as opposed to remote server firewall rules, as skrilled suspected).
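For anyone else debugging this kind of hang, checking curl_errno()/curl_error() after curl_exec() helps tell a blocked connection (firewall, wrong port) apart from a genuinely slow Solr response. A small sketch of that check, replacing the last two lines of the snippet above:
$raw = curl_exec($ch);
if ($raw === false) {
    // Error 28 (CURLE_OPERATION_TIMEDOUT) with no bytes received usually means the
    // connection never got through - a firewall or wrong port - rather than slow Solr.
    error_log('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
} else {
    $data = json_decode($raw, TRUE);
}
curl_close($ch);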

Setting up a proxy for an Axis IP camera in PHP

I have been searching for a way to proxy an MJPEG stream from the AXIS M1114 Network Camera,
using the following URL setup:
http://host:port/axis-cgi/mjpg/video.cgi?resolution=320x240&camera=1
I try to capture the output and make it available to users with a PHP script running on an Apache server on Ubuntu.
Having browsed the web looking for an answer to no avail, I come to you.
My ultimate goal is to have users able to link to the proxy like this:
<img src='proxy.php'>
and have all the details handled inside proxy.php.
I have tried the cURL approach (advised in a similar thread here), but I can't get it to work, probably due to my lack of knowledge of the inner workings.
Currently my very simple proxy.php looks like this:
<?php
$camurl = "http://ip:port";
$campath = "axis-cgi/mjpg/video.cgi";
$userpass = "user:pw";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_URL, $camurl + $campath);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'resolution=320x240&camera=1');
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_USERPWD, $userpass);
$result = curl_exec($ch);
header('Content-type: image/jpeg');
echo $result;
curl_close($ch);
?>
My understanding is that this would produce an acceptable output for my plan. But alas.
My question is whether there is a blatant error I do not see. Any simpler option or way of getting the result I'm aiming for is welcome too.
Please point me in the right direction. I'll happily provide any relevant information I might have missed. Thank you in advance.
Solved edit:
After commenting out:
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
changing
curl_setopt($ch, CURLOPT_URL, $camurl + $campath);
to
curl_setopt($ch, CURLOPT_URL, $camurl . $campath); (I was mixing up languages)
and, most importantly, removing a stray space in the .php file so that the header is actually sent as a header, it sort of does what I wanted.
Adding
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
also seems to be needed to get the image displayed as an image rather than as raw data.
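Putting those fixes together, the corrected proxy.php would look roughly like this (just a sketch; the camera host, port and credentials are placeholders, and the slash between host and path is an assumption):
<?php
$camurl   = "http://ip:port/";            // placeholder host:port, note the trailing slash
$campath  = "axis-cgi/mjpg/video.cgi";
$userpass = "user:pw";                    // placeholder credentials
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $camurl . $campath);   // '.' concatenates in PHP, not '+'
curl_setopt($ch, CURLOPT_POSTFIELDS, 'resolution=320x240&camera=1');
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_USERPWD, $userpass);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return the data instead of printing it immediately
$result = curl_exec($ch);
curl_close($ch);
// No output (not even whitespace) may be sent before this call, or the header is ignored.
header('Content-type: image/jpeg');
echo $result;
?>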

Set up a PHP proxy to access censored websites and bypass firewalls

I'm currently using this plugin http://wordpress.org/extend/plugins/repress/ which basically makes my website a proxy so that users can access censored websites like this
www.mywebsite.com/proxy/www.cnn.com
The plugin works well enough, but it doesn't handle absolute links properly, so those links remain blocked. Development of that plugin has stopped, so I need to write my own script. I've been searching everywhere and reading every tutorial I can find, but none specifically addresses this.
I know how to use PHP cURL to fetch a website and echo it on a blank page. What I don't know is how to set up a proxy script that works like the above example, where users can type
www.mywebsite.com
followed by
/proxy.php
then their target website
/www.cnn.com
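For that URL shape, the target can be read from PATH_INFO. A minimal sketch, assuming the web server passes PATH_INFO through to PHP and that a missing scheme should default to http://:
// e.g. request: http://www.mywebsite.com/proxy.php/www.cnn.com/world
$target = isset($_SERVER['PATH_INFO']) ? ltrim($_SERVER['PATH_INFO'], '/') : '';
if ($target === '') {
    die('No target URL given');
}
// Default to http:// if no scheme was supplied.
if (!preg_match('#^https?://#i', $target)) {
    $target = 'http://' . $target;
}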
Currently I have this set up:
<?php
$url = 'http://www.cnn.com';
$proxy_port = 80;
$proxy = '92.105.140.115';
$timeout = 0;
$referer = 'http://www.mydomain.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_PROXYPORT, $proxy_port);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP); // the constant, not the string 'HTTP'
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $referer);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
?>
This pulls in the home page, but no CSS or images are retrieved, and all relative links are broken. I also have no idea how to apply the proxy_port and proxy variables. I tried
92.105.140.115:80/www.cnn.com
but that doesn't work. I don't quite fully understand this code either, since I found it on an example site.
Any answers or links to tutorials are greatly welcome.
Thank you!
Having a completely functioning proxy isn't that simple. There are many such projects already available. Give one of them a shot:
http://www.surrogafier.info/
https://github.com/Alexxz/Simple-php-proxy-script
http://www.glype.com/
Have fun!
You can't just echo the result of a cURL fetch, because the browser will then resolve the URIs incorrectly. When the user clicks a link, they need to go to your proxy site, not to the original site, so you can't just print the page with echo: you need to rewrite every link in the fetched page before printing it to the user. I have a fully functional proxy written in PHP at p.listascuba.com that you can try.
Contact me for more info.
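A minimal sketch of that link rewriting, assuming the proxy is reached as proxy.php?url=... (the parameter name and the DOMDocument approach are my own choices here; relative URLs would additionally need to be resolved against the original page's base URL first):
// Rewrite links in the fetched HTML so they point back through the proxy.
function rewrite_links($html, $proxyBase = 'proxy.php?url=') {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect real-world HTML
    foreach (array('a' => 'href', 'img' => 'src', 'link' => 'href', 'script' => 'src') as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $target = $node->getAttribute($attr);
            if ($target !== '') {
                $node->setAttribute($attr, $proxyBase . urlencode($target));
            }
        }
    }
    return $doc->saveHTML();
}
echo rewrite_links($data);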

Copying an image with PHP

With permission from the other site, I am supposed to periodically download an image hosted on another site and include it in a collection of webcam photos on our site with an external link to the contributing site.
This hasn't been an issue for any of the other sites, but with this particular one I can't open the image to resize it and save it to our server. It's not a hotlinking issue, because I can put a plain old <img src="http://THEIRIMAGE" /> on a page on our site and it works fine.
I've tried using $img = new Imagick($sourceFilePath) directly, as with all the others, as well as PHP's copy(), and also copying the image using cURL, but in each case the page just times out with no results at all.
Here's the image in question: http://island-alpaca.selfip.com:10202/SnapshotJPEG?Resolution=640x480&Quality=Standard
Like I've said, I'm able to do this sort of thing with several other webcams, but it isn't working with this one, and I am stuck as to why it isn't. Any help would be greatly appreciated.
Thank you.
In order to reduce bandwidth and server load, some sites block certain bots from accessing their content. Your cURL request needs to mimic an actual browser more closely, which usually means sending a referrer, a user agent, and so on. It could also be that there is a redirect and you haven't told cURL to follow redirects.
Try setting more opts like this:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_USERAGENT, 'PHP');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
If that doesn't work, get the user agent from an actual browser and put it in there and see if that makes a difference.
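For example, a user-agent string copied from a desktop browser looks something like the following (the exact value is only illustrative; any current browser's string will do):
// Illustrative desktop-browser user-agent string.
curl_setopt($ch, CURLOPT_USERAGENT,
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36');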
Try downloading the image with wget and then reading it with file_get_contents(). For example:
$url = 'http://island-alpaca.selfip.com:10202/SnapshotJPEG?Resolution=640x480&Quality=Standard';
$outputfile = "tmp/image" . date("Y-m-d_H.i.s");
$cmd = "wget -q \"$url\" -O $outputfile";
exec($cmd);
$temp_img = file_get_contents($outputfile);
$img = new Imagick();
$img->readImageBlob($temp_img); // Imagick's constructor expects a file path, so load the raw bytes with readImageBlob()
Can you try this and get back to me?

PHP / Curl: HEAD Request takes a long time on some sites

I have simple code that does a head request for a URL and then prints the response headers. I've noticed that on some sites, this can take a long time to complete.
For example, requesting http://www.arstechnica.com takes about two minutes. I've tried the same request using another web site that does the same basic task, and it comes back immediately. So there must be something I have set incorrectly that's causing this delay.
Here's the code I have:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt ($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
$content = curl_exec ($ch);
curl_close ($ch);
Here's a link to the web site that does the same function: http://www.seoconsultants.com/tools/headers.asp
The code above, at least on my server, takes two minutes to retrieve www.arstechnica.com, but the service at the link above returns it right away.
What am I missing?
Try simplifying it a little bit:
print htmlentities(file_get_contents("http://www.arstechnica.com"));
The above outputs instantly on my webserver. If it doesn't on yours, there's a good chance your web host has some kind of setting in place to throttle these kind of requests.
EDIT:
Since the above happens instantly for you, try setting this curl setting on your original code:
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
Using the tool you posted, I noticed that http://www.arstechnica.com has a 301 header sent for any request sent to it. It is possible that cURL is getting this and not following the new Location specified to it, thus causing your script to hang.
SECOND EDIT:
Curiously enough, trying the same code you have above was making my webserver hang too. I replaced this code:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
With this:
curl_setopt($ch, CURLOPT_NOBODY, true);
Which is the way the manual recommends you do a HEAD request. It made it work instantly.
You have to remember that HEAD is only a suggestion to the web server. For HEAD to do the right thing it often takes some explicit effort on the part of the admins. If you HEAD a static file, Apache (or whatever your web server is) will often step in and do the right thing. If you HEAD a dynamic page, the default for most setups is to execute the GET path, collect all the results, and just send back the headers without the content. If that application is in a 3 (or more) tier setup, that call could potentially be very expensive and needless for a HEAD context. For instance, on a Java servlet, by default doHead() just calls doGet(). To do something a little smarter for the application, the developer would have to explicitly implement doHead() (and more often than not, they will not).
I encountered an app from a fortune 100 company that is used for downloading several hundred megabytes of pricing information. We'd check for updates to that data by executing HEAD requests fairly regularly until the modified date changed. It turns out that this request would actually make back end calls to generate this list every time we made the request which involved gigabytes of data on their back end and xfer it between several internal servers. They weren't terribly happy with us but once we explained the use case they quickly came up with an alternate solution. If they had implemented HEAD, rather than relying on their web server to fake it, it would not have been an issue.
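Since HEAD support is so hit-or-miss, one possible workaround (a sketch of my own, not taken from the answers above) is to try a real HEAD via CURLOPT_NOBODY and, if the server rejects it, fall back to a one-byte ranged GET just to read the headers:
// HEAD first; if the server clearly mishandles it, fall back to a tiny ranged GET.
function fetch_headers($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);           // a proper HEAD request
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $headers = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($headers === false || $code >= 400) {
        // Some servers reject or mishandle HEAD; ask for a single byte instead.
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('Range: bytes=0-0'));
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        $headers = curl_exec($ch);
        curl_close($ch);
    }
    return $headers;
}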
If my memory doesn't fail me, doing a HEAD request in cURL changes the HTTP protocol version to 1.0 (which is slow and probably the guilty party here). Try changing that to:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt ($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1); // ADD THIS
$content = curl_exec ($ch);
curl_close ($ch);
I used the below function to find out the redirected URL.
$head = get_headers($url, 1);
The second argument makes it return an array with keys. For example, the following will give the Location value.
$head["Location"]
http://php.net/manual/en/function.get-headers.php
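Put together, a small usage sketch (note that with a redirect chain, "Location" can come back as an array of successive targets):
$head = get_headers("http://www.arstechnica.com", 1);
if (isset($head["Location"])) {
    // With multiple redirects, Location is an array; the last entry is the final target.
    $location = is_array($head["Location"]) ? end($head["Location"]) : $head["Location"];
    echo $location;
}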
This:
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
I wasn't trying to get headers; I was just trying to make a page load of some data not take two minutes, similar to what is described above.
That magical little option dropped it down to 2 seconds.
