Reading HTML file from URL - php

While most of the time I'd just use file_get_contents or cURL, I can't get either to work with a port in the URL. How can I read this file?
http://174.120.124.178:7800/7.html (It's a shoutcast statistics file)
Ultimately, I just want the text after the last comma.

It has nothing to do with the port. They're blocking you because you're not using a browser user agent. curl does let you fake the user agent, but that may be a violation of the site's terms of service.
According to this post it's not about blocking scripts, but just distinguishing between Shoutcast clients and everything else. So the code is:
curl_setopt($curl_handle, CURLOPT_USERAGENT, "Mozilla");
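Putting it together, a minimal sketch (assuming the URL and port from the question, and that Shoutcast's 7.html returns a single comma-separated status line) could look like this:
<?php
// Fetch the Shoutcast status page with a browser-like user agent.
$curl_handle = curl_init();
curl_setopt($curl_handle, CURLOPT_URL, 'http://174.120.124.178:7800/7.html');
curl_setopt($curl_handle, CURLOPT_USERAGENT, 'Mozilla');
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
$html = curl_exec($curl_handle);
curl_close($curl_handle);

// Keep only the text after the last comma (for 7.html that is the current song title).
$plain = strip_tags($html);
$song  = ($pos = strrpos($plain, ',')) !== false ? substr($plain, $pos + 1) : $plain;
echo trim($song);
?>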

I tried to download your file with cURL on the command line and got a 404 error; it does load with Firefox and Lynx. This page says that you need to change the User-Agent string for it to download.

Perhaps CURLOPT_PORT needs to be set to the appropriate port.
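If you do want to set the port explicitly rather than embedding it in the URL, that would look like this (using the same handle as in the sketch above):
curl_setopt($curl_handle, CURLOPT_URL, 'http://174.120.124.178/7.html');
curl_setopt($curl_handle, CURLOPT_PORT, 7800);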

Related

How To View All Secured & Unsecured Pages, Via Web Proxy, With SSL Verification?

I tested Mini Proxy but it showed a completely blank page when I tried viewing https pages. I was told to add these two lines after "$ch = curl_init();", and it worked: I am now able to view https pages on my web proxy.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
However, after reading a little more about cURL, I realize those two lines are risky, since my web proxy would not verify SSL certificates.
Q1. What steps should I follow, and what lines of code should I add, so that my web proxy verifies SSL certificates using the same CAs that modern browsers use to verify certificates and digital signatures? What must I do now to add the certificate verification feature?
Q2. I need to add a bad-words filter so that if a user tries to view a webpage whose content contains the bad words, the web proxy does not load the page but echoes an error instead. I can write the filter myself once I am sure which line to add it on; currently I am stuck on where it should go in Mini Proxy.
Mini Proxy's source code, by Josh Dick, can be found at the following link, from which I downloaded it: https://github.com/joshdick/miniProxy/blob/master/miniProxy.php
Could someone be kind enough to let me know on which line I should add the bad-words filter, and which variable the filter should be applied to?
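For Q1, a minimal sketch of re-enabling verification: turn the two options back on and point cURL at a CA bundle. The bundle path below is an assumption; use your system's ca-certificates file or a downloaded copy of Mozilla's cacert.pem.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); // verify that the certificate matches the host name
curl_setopt($ch, CURLOPT_CAINFO, '/etc/ssl/certs/ca-certificates.crt'); // assumed path, adjust for your server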

Equivalent of iframe on server side

Is there any way to create something "iframe-like" on the server side? I need to access certain pages of my company's intranet from the administration part of our website.
I already have an SQL link to the database that works fine, but here I would like to access the pages without duplicating the source code on the webserver.
My infrastructure is the following:
The Webserver is in a DMZ and has the following local IP: 192.168.63.10.
Our Intranet server is NOT in the DMZ and has the following IP: 192.168.1.20.
Our firewall has several rules and I've just added the following:
DMZ->LAN Allow HTTP/HTTPS traffic and LAN->DMZ Allow HTTP/HTTPS (just as we've done for the SQL redirection)
I've tried the following PHP function:
$ch = curl_init();
// set URL and other appropriate options (also tried with the IP address instead of the domain)
curl_setopt($ch, CURLOPT_URL, "http://intranet.socname.ch/");
curl_setopt($ch, CURLOPT_HEADER, 0);
// grab URL and pass it to the browser
curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
I've also tried:
$page = file_get_contents('http://192.168.1.20/');
echo $page;
Or:
header('Location:http://192.168.1.20');
But in all those cases, it works fine locally but not from the internet. From the internet, it doesn't load and, after a while, says that the server isn't responding.
Thanks for your help !
Your first and second solutions could work. Can your webserver reach 192.168.1.20 (try ping 192.168.1.20 on the webserver) and resolve the hostname intranet.socname.ch (try nslookup intranet.socname.ch)?
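If you'd rather test from PHP than from a shell, a quick sketch (host and port taken from the question) could be:
<?php
// Can the webserver resolve and reach the intranet host on port 80?
$host = 'intranet.socname.ch'; // or '192.168.1.20'
$ip = gethostbyname($host);    // returns the name unchanged if it cannot be resolved
echo "Resolved to: $ip\n";

$fp = @fsockopen($ip, 80, $errno, $errstr, 5); // 5 second timeout
if ($fp) {
    echo "TCP connection to $ip:80 succeeded\n";
    fclose($fp);
} else {
    echo "Connection failed: $errstr ($errno)\n";
}
?>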
What you're looking for is called a "proxy"; here is a simple PHP project that I found:
https://github.com/Alexxz/Simple-php-proxy-script
Download the repo, copy example.simple-php-proxy_config.php to simple-php-proxy_config.php and change $dest_host = "intranet.socname.ch";
It should do the trick! (may also need to change $proxy_base_url)
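A sketch of what the edited config might contain; the $proxy_base_url value is hypothetical, so check example.simple-php-proxy_config.php in the repo for the exact settings it expects:
<?php
// simple-php-proxy_config.php (sketch, based on the variables named above)
$dest_host      = "intranet.socname.ch";
$proxy_base_url = "https://www.yoursite.ch/admin/proxy/"; // hypothetical public URL of the proxy script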

curl_exec($ch) not executing on external domains anymore, why?

I was using cURL to scrape content from a site and just recently my page started hanging when it reached curl_exec($ch). After some tests I noticed that it could load any page from my own domain, but when attempting to load anything external I get a connect() timeout! error.
Here's a simplified version of what I was using:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://www.google.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
$contents = curl_exec ($ch);
curl_close ($ch);
echo $contents;
?>
Here's some info I have about my host from my phpinfo():
PHP Version 5.3.1
cURL support enabled
cURL Information 7.19.7
Host i686-pc-linux-gnu
I don't have access to SSH or modifying the php.ini file (however I can read it). But is there a way to tell if something was recently set to block cURL access to external domains? Or is there something else I might have missed?
Thanks,
Dave
I'm not aware of any setting like that; it would not make much sense.
Since you said you are on a remote webserver without console access, I'd guess that your activity has been detected by the host, or more likely that it caused issues, and so they firewalled you.
A silent iptables DROP would cause this.
When scraping Google you need to use proxies for anything more than a handful of requests, and you should never abuse your webserver's primary IP if it's not your own. That's likely a breach of their TOS and could even result in legal action if they get banned from Google (which can happen).
Take a look at Google rank checker; it's a PHP script that does exactly what you want using cURL and proper IP management.
I can't think of anything other than a firewall on your side that would cause a timeout.
I'm not sure why you're getting the connect() timeout! error, but look at the following line:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
If that option is not set to 1, curl_exec() will not return the page's content into $contents (it will output it directly instead).
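So if you want the body in $contents, the corrected snippet would look roughly like this (with a connect timeout added so a firewalled request fails fast instead of hanging):
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.google.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the page instead of echoing it
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); // fail fast if the connection is blocked
$contents = curl_exec($ch);
if ($contents === false) {
    echo 'cURL error: ' . curl_error($ch);
}
curl_close($ch);
echo $contents;
?>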

check existance of image on remote server with allow_url_fopen=off

I am working on a site which fetches images from a CDN server, and allow_url_fopen is turned off (allow_url_fopen=off).
The problem is that before showing an image on the main site we have to check that it exists on the CDN server. If the image exists we show it; otherwise we don't.
file_exists and file_get_contents will not work as they require allow_url_fopen=on.
Is there any other way to do it?
Any help will be appreciated.
Thanks
You can use cURL:
$curl = curl_init($url); // $url is the CDN image URL
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($curl);
curl_close($curl);
if (empty($result)) { die('No image'); }
You can use the cURL library to request the file over HTTP. If the file exists, the response will carry a success status code; otherwise a Not Found status code will be returned (a sketch of this approach follows the links below). Another option is to use a socket function such as fsockopen to connect to the server and access the files via FTP, if that is enabled.
Go through the examples given
http://php.net/manual/en/function.fsockopen.php
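A sketch of the status-code approach; the $url value here is a made-up CDN image URL:
<?php
// Check whether the remote image exists without downloading its body.
$url = 'http://cdn.example.com/images/photo.jpg'; // hypothetical URL
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_NOBODY, true);         // HEAD request, no body transfer
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); // follow CDN redirects if any
curl_exec($curl);
$status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
curl_close($curl);

if ($status == 200) {
    echo 'Image exists';
} else {
    echo 'Image missing (HTTP ' . $status . ')';
}
?>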

Is it a secure SSL data transmission? Or not?

I want to get some information from one website into a PHP script on another website via HTTPS. I read on www.php.net, on the fopen() function's page, that this function supports the HTTPS protocol.
But is it really a secure SSL transmission? Is the GET variable "private" value visible on the network or not? Do I get the $contents value securely?
$filename = 'https://www.somesite.com/page.php?private=45456762154';
$handle = fopen($filename , 'r');
$contents = stream_get_contents($handle);
fclose($handle);
You can check by using a tool such as Wireshark. This tool will intercept the network traffic and tell you which protocol it is travelling as, and allow you to inspect the packet contents. If it's unintelligible, it's SSL :-)
As an aside, if you're using a browser (which you're not), a similar tool is Fiddler to inspect the HTTP traffic your browser is seeing.
Check out this answer: https URL with token parameter : how secure is it?
In short, it is a bad idea to pass secret parameters as GET variables, because URLs get logged on servers and get passed around in Referer headers.
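A sketch of both points: explicitly enforce certificate verification (older PHP versions did not verify peers by default) and move the token out of the URL, for example into a request header. The header name here is made up for illustration, and the receiving page would have to be changed to read it.
<?php
$context = stream_context_create(array(
    'ssl' => array(
        'verify_peer'      => true,
        'verify_peer_name' => true,
    ),
    'http' => array(
        'method' => 'GET',
        // Send the secret in a header instead of the query string so it
        // does not end up in server logs or Referer headers.
        'header' => "X-Private-Token: 45456762154\r\n", // hypothetical header name
    ),
));

$contents = file_get_contents('https://www.somesite.com/page.php', false, $context);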
