$fileSource = "http://google.com";
$ch = curl_init($fileSource);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($retcode != 200) {
$error .= "The source specified is not a valid URL.";
}
curl_close($ch);
Here's my issue. When I use the above and set $fileSource = "http://google.com"; it does not work, whereas if I set it to $fileSource = "http://www.google.com/"; it works.
What's the issue?
One permanently redirects (301) to the www. domain, while the other one just replies OK (200).
Why are you considering only the 200 status code as valid? Let cURL handle that for you:
curl_setopt($ch, CURLOPT_FAILONERROR, true);
From the manual:
TRUE to fail silently if the HTTP code returned is greater than or
equal to 400. The default behavior is to return the page normally,
ignoring the code.
Try explicitly telling curl to follow redirects:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
If that doesn't work, you may need to spoof a user agent on some sites.
Also, if they are using JS redirects, you're out of luck.
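A minimal sketch of a URL check that follows redirects before looking at the status code. The helper names isSuccessStatus and fetchFinalStatus, the redirect limit, and the timeout are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: treat any 2xx status as "the URL is valid".
function isSuccessStatus(int $code): bool
{
    return $code >= 200 && $code < 300;
}

// Hypothetical helper: send a body-less request, follow redirects,
// and return the status code of the final response (0 if the transfer failed).
function fetchFinalStatus(string $url, int $maxRedirects = 5): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);               // headers only, as in the question
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);       // follow 3xx Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects);   // assumed limit to avoid redirect loops
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);       // do not print anything
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);                // assumed timeout in seconds
    curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return $code;
}
```

With redirects followed, isSuccessStatus(fetchFinalStatus($fileSource)) judges the page the redirect chain ends on rather than the first 301.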
What you're seeing is actually the result of a 301 redirect. Here's what I got back using a verbose curl from the command line:
curl -vvvvvv http://google.com
* About to connect() to google.com port 80 (#0)
* Trying 173.194.43.34...
* connected
* Connected to google.com (173.194.43.34) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.25.0 (x86_64-apple-darwin11.3.0) libcurl/7.25.0 OpenSSL/1.0.1b zlib/1.2.6 libidn/1.22
> Host: google.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Location: http://www.google.com/
< Content-Type: text/html; charset=UTF-8
< Date: Fri, 04 May 2012 04:03:59 GMT
< Expires: Sun, 03 Jun 2012 04:03:59 GMT
< Cache-Control: public, max-age=2592000
< Server: gws
< Content-Length: 219
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
<
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
* Connection #0 to host google.com left intact
* Closing connection #0
However, if you do a curl on the actual www.google.com suggested in the 301 redirect, you'll get the following.
curl -vvvvvv http://www.google.com
* About to connect() to www.google.com port 80 (#0)
* Trying 74.125.228.19...
* connected
* Connected to www.google.com (74.125.228.19) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.25.0 (x86_64-apple-darwin11.3.0) libcurl/7.25.0 OpenSSL/1.0.1b zlib/1.2.6 libidn/1.22
> Host: www.google.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 04 May 2012 04:05:25 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
I've truncated the remainder of Google's response just to show the primary difference between the 200 OK and the 301 redirect.
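The same difference can be read programmatically. The sketch below assumes the raw response headers are available as a string (for example from curl_exec() with CURLOPT_HEADER and CURLOPT_RETURNTRANSFER enabled); the function name parseStatusAndLocation is hypothetical.

```php
<?php
// Hypothetical helper: pull the status code and the Location header
// out of a raw HTTP response header block.
function parseStatusAndLocation(string $rawHeaders): array
{
    $status = null;
    $location = null;
    foreach (preg_split('/\r\n|\n|\r/', trim($rawHeaders)) as $line) {
        if ($status === null && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            // Status line, e.g. "HTTP/1.1 301 Moved Permanently"
            $status = (int) $m[1];
        } elseif (stripos($line, 'Location:') === 0) {
            // Redirect target, e.g. "Location: http://www.google.com/"
            $location = trim(substr($line, strlen('Location:')));
        }
    }
    return ['status' => $status, 'location' => $location];
}
```

For the first response shown above this gives status 301 with the www address as the location; a 200 response has no Location header, so the location stays null.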
I am writing a PHP web application that has to connect to a web service using Kerberos 5 authentication (Active Directory). My PHP website is hosted on IIS 7.5 with PHP 5.5. The application pool is running under the account that is authorized in Active Directory and for the target web service.
I tried every code example that I could find on this site and other sites, but to no avail.
This is the PHP code I am using now:
$url = 'http://mywebservice/login/kerberos';
$ch = curl_init();
$options = [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_VERBOSE => true,
CURLOPT_HTTPAUTH => CURLAUTH_GSSNEGOTIATE,
CURLOPT_HTTPHEADER => ['Authorization: Negotiate'],
CURLOPT_USERPWD => 'myuser',
CURLOPT_URL => $url,
CURLOPT_HEADER => 1
];
curl_setopt_array( $ch, $options);
$result = curl_exec($ch);
$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
$header = substr($result, 0, $header_size);
$body = substr($result, $header_size);
print $result;
This gives me the following message:
HTTP/1.1 302 Found Date: Fri, 21 Oct 2016 14:49:15 GMT X-Robots-Tag: noindex,nofollow WWW-Authenticate: Location: http://mywebservice/login?login_fail Content-Length: 0
When I remove the CURLOPT_HTTPHEADER => ['Authorization: Negotiate'] I get an Internal Server Error from the curl module.
When I use curl on the command line I get the following result:
curl --negotiate http://mywebservice/login/kerberos -umyuser#mydomain --verbose -c "c:\cookie.txt" -b "c:\cookie.txt"
Enter host password for user 'myuser#mydomain':
* Trying (192.168.1.1...
* Connected to mywebservice (192.168.1.1) port 80 (#0)
> GET /login/kerberos HTTP/1.1
> User-Agent: curl/7.41.0
> Host: mywebservice
> Accept: */*
> Cookie: JSESSIONID_PUBLIC=X(MASKED)XXXXXXXXXXXXXXXXX
>
< HTTP/1.1 401 Unauthorized
< Date: Fri, 21 Oct 2016 14:52:46 GMT
< X-Robots-Tag: noindex,nofollow
< WWW-Authenticate: Negotiate
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Last-Modified: Thu, 20 Oct 2016 14:52:46 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< P3P: CP=CAO PSA OUR
< Content-Type: text/html; charset=UTF-8
< Content-Length: 3643
<
* Ignoring the response-body
* Connection #0 to host mywebservice left intact
* Issue another request to this URL: 'http://mywebservice/login/kerberos'
* Found bundle for host mywebservice: 0xXXXXXXX
* Re-using existing connection! (#0) with host mywebservice
* Connected to mywebservice (192.168.1.1) port 80 (#0)
* Server auth using Negotiate with user 'myuser#mydomain'
> GET /login/kerberos HTTP/1.1
> Authorization: Negotiate X(MASKED)XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXD
w==
> User-Agent: curl/7.41.0
> Host: mywebservice
> Accept: */*
> Cookie: JSESSIONID_PUBLIC=X(MASKED)XXXXXXXXXXXXXXXXX
>
< HTTP/1.1 302 Found
< Date: Fri, 21 Oct 2016 14:52:46 GMT
< X-Robots-Tag: noindex,nofollow
< WWW-Authenticate:
< Location: http://mywebservice/?login_fail
< Content-Length: 0
<
* Connection #0 to host mywebservice left intact
When I test with the KerberosAuthenticationTester tool (http://blog.michelbarneveld.nl/michel/archive/2009/12/05/kerberos-authentication-tester.aspx) it authenticates me right away when I pass the url and credentials.
I assume that it is not working because I am missing the krb5 library. I could not find it as a DLL, so I tried compiling it with the PHP source in Visual Studio. That did not work either, because I am missing the config.w32 file. If necessary I can elaborate on that, but first I want to know whether this is really needed.
I also installed MIT Kerberos, but this did not help either.
Is it correct that I need the krb5 DLL, or am I on the wrong track? If I need this DLL, where can I get it or how can I compile it? If there is another solution I would be very happy to hear it.
Thanks everyone for taking your time for me and replying!
Hi guys, today I have an interesting question.
What's the best way to build a web service in Joomla?
I'm trying to build a web service in Joomla and I have the following problem.
This is the view file, components/com_webservice/view/view.json.php:
<?php
defined('_JEXEC') or die('Restricted access');
jimport('joomla.application.component.view');
class WebServicesViewServices extends JViewLegacy {
private $data;
function __construct($config = array()) {
JLoader::import('models.services', JPATH_COMPONENT);
$model = new WebServicesModelServices();
if ($model->errors) {
echo json_encode($model->errors);
jexit();
}else{
$this->data = array('iphone' => '5s','iphone' => '6','iphone' => '6s','iphone' => '6s plus');
}
parent::__construct($config);
}
function display($tpl = null) {
echo json_encode($this->data);
}
}
?>
The problem is that if I run curl http://wsn.jserver/index.php?option=com_services&format=json
to consume this service, it responds with:
* Connected to wsn.jserver (127.0.0.1) port 80 (#0)
> GET /index.php?option=com_jserver HTTP/1.1
> Host: wsn.jserver
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 303 See other
< Date: Fri, 23 Oct 2015 02:40:37 GMT
< Server: Apache/2.4.16 (Unix) PHP/5.5.30
< X-Powered-By: PHP/5.5.30
< Set-Cookie: 4dbb8abeb5e7919ee73c8545901d5f62=d6ksd6e93t99q7hsk8cf10hq35; path=/; HttpOnly
< Set-Cookie: e909c2d7067ea37437cf97fe11d91bd0=DO
< Location: http://wsn.jserver/index.php?lang=es
< Content-Length: 0
< Content-Type: text/html; charset=utf-8
<
* Connection #0 to host wsn.jserver left intact
How can I make this work?
What is the best way to build web services in Joomla?
Have you checked the JoomlaTools Framework? From the linked page:
Designed around the HTTP protocol. Each component automatically provides a level 3 JSON REST API out of the box, no extra coding required.
Solved!
I've found the problem.
Joomla redirects by default in order to select a language.
In my case, I duplicated the languagefilter plugin and added a check so that it does not act when option="com_services".
Then, when I run the command "curl -v http://wsn.jserver/index.php?option=com_webervices", the response is:
* Trying 127.0.0.1...
* Connected to wsn.pawad (127.0.0.1) port 80 (#0)
> GET /index.php?option=com_pawaservices HTTP/1.1
> Host: wsn.jserver
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 23 Oct 2015 18:48:09 GMT
< Server: Apache/2.4.16 (Unix) PHP/5.5.30
< X-Powered-By: PHP/5.5.30
< Set-Cookie: 4dbb8abeb5e7919ee73c8545901d5f62=6a8m8cdte288k3jp2kvefmfe07; path=/; HttpOnly
< Content-Length: 16
< Content-Type: text/html
<
* Connection #0 to host wsn.pawad left intact
{"iphone":"5s", "iphone":"6", "iphone":"6s", "iphone":"6s plus"}
In conclusion, the problem is caused by Joomla's language redirect. To solve it, you can modify the languagefilter plugin (plugins/system/languagefilter/languagefilter.php) or create a new plugin.
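Separately from the redirect, the data array in the view repeats the key 'iphone'. PHP arrays hold one value per key, so only the last assignment survives json_encode(). A small sketch of the effect and of one possible alternative (nesting the models in a list is an assumption about the intended output, not something the original states):

```php
<?php
// Repeating a key overwrites the earlier values; only the last one remains.
$repeated = array('iphone' => '5s', 'iphone' => '6', 'iphone' => '6s', 'iphone' => '6s plus');
$repeatedJson = json_encode($repeated);

// Assumed alternative: keep all models as a list under a single key.
$listed = array('iphone' => array('5s', '6', '6s', '6s plus'));
$listedJson = json_encode($listed);

echo $repeatedJson, PHP_EOL; // {"iphone":"6s plus"}
echo $listedJson, PHP_EOL;   // {"iphone":["5s","6","6s","6s plus"]}
```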
I have a simple question. I am using cURL to send data as POST fields to my destination server, with the following code:
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, $url);
curl_setopt($ch,CURLOPT_POST, count($_POST));
curl_setopt($ch,CURLOPT_POSTFIELDS, $postStr);
$result = curl_exec($ch);
curl_close($ch);
And then I am capturing the data using
$post = $_POST;
So is there any way I can determine which server the cURL data is coming from? It comes from 5 different servers, and I want to capture the URL of the source that sent the POST fields. Thanks
Use the verbose mode (-v) on the cURL program.
Example:
C:\curl www.google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
here.
</BODY></HTML>
Or:
C:\curl -v www.google.com
* Rebuilt URL to: www.google.com/
* Trying 216.58.222.4... <<< --- <<< --- YOUR SERVER IS THIS ONE
* Connected to www.google.com (216.58.222.4) port 80 (#0)
> GET / HTTP/1.1
> Host: www.google.com
> User-Agent: curl/7.42.0
> Accept: */*
>
< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=UTF-8
< Location: http://www.google.com.br/?gfe_rd=cr&ei=A76WVZvWCO-p8wf334HIDg
< Content-Length: 262
< Date: Fri, 03 Jul 2015 16:53:23 GMT
< Server: GFE/2.0
< Alternate-Protocol: 80:quic,p=0
<
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
here.
</BODY></HTML>
* Connection #0 to host www.google.com left intact
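On the receiving side, PHP records the address of the connecting client in $_SERVER['REMOTE_ADDR']. A minimal sketch, assuming each sending server has a fixed, known IP address; the addresses, labels, and the function name identifySender are placeholders. If the requests pass through a proxy, REMOTE_ADDR is the proxy's address, so this would not be enough on its own.

```php
<?php
// Hypothetical map of sender IP => label; the addresses are documentation placeholders.
const KNOWN_SENDERS = [
    '192.0.2.10' => 'server-a',
    '192.0.2.11' => 'server-b',
];

// Hypothetical helper: look up the connecting client's address in the map.
// Pass $_SERVER in the receiving script.
function identifySender(array $server, array $known = KNOWN_SENDERS): string
{
    $ip = $server['REMOTE_ADDR'] ?? '';
    return $known[$ip] ?? 'unknown (' . ($ip !== '' ? $ip : 'no address') . ')';
}
```

In the script that reads $_POST, a call such as identifySender($_SERVER) could be logged next to the received fields. Another option is to have each sender add its own identifying field to the POST data.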
I am trying to download an XML file from a Polish website. For the first few days it worked, but then I could no longer download the file to my server (although I can still open and download it on my computer). The file on my server, which should contain XML, instead contains an HTML page telling me that I have been blocked.
I contacted the webmaster of the site I want to get the XML from, and he told me that I am not blocked by IP address. So the question is: what should I send in the headers, or what else should I do, to download this file?
My code to download the XML file is below, and here is the XML I want to download: http://www.polskatimes.pl/rss/fakty_kraj.xml
$headers[] = "User-Agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13";
$headers[] = "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Accept-Language:pl-PL,pl;q=0.8";
$headers[] = "Accept-Encoding:gzip,deflate,sdch";
$headers[] = "Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$headers[] = "Keep-Alive:115";
$headers[] = "Connection:keep-alive";
$headers[] = "Cache-Control:max-age=0";
$xml_data = file_get_contents($xml,false,stream_context_create(
array("http" => array('header' => $headers)))); // the downloaded content is now in the string $xml_data
file_put_contents($xml_md5, $xml_data); // the content is now saved to the file named by $xml_md5
Request the URL in verbose mode (-v):
* About to connect() to www.polskatimes.pl port 80 (#0)
* Trying 195.8.99.38... connected
* Connected to www.polskatimes.pl (195.8.99.38) port 80 (#0)
> GET /rss/fakty_kraj.xml HTTP/1.1
> User-Agent: curl/7.21.0 (x86_64-pc-linux-gnu) libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6
> Host: www.polskatimes.pl
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Thu, 18 Apr 2013 10:40:15 GMT
< Content-Type: text/html; charset=utf8
< Transfer-Encoding: chunked
< Connection: close
< Vary: Accept-Encoding
< Expires: Thu, 18 Apr 2013 10:40:15 GMT
< Cache-Control: max-age=0
(html page with message that I am temporary blocked)
* Closing connection #0
To inspect what happens behind the scenes (and which headers you actually need), you need to analyze a little. There is nothing magic about it; you can do it on the command line with a program called curl, which is available for many (perhaps all?) platforms.
The first step is most often to request the URL in verbose mode (-v):
$ curl -v http://www.polskatimes.pl/rss/fakty_kraj.xml
* About to connect() to www.polskatimes.pl port 80 (#0)
* Trying 195.8.99.38... connected
* Connected to www.polskatimes.pl (195.8.99.38) port 80 (#0)
> GET /rss/fakty_kraj.xml HTTP/1.1
> User-Agent: curl/7.21.1 (i686-pc-mingw32) libcurl/7.21.1 OpenSSL/0.9.8r zlib/1.2.3
> Host: www.polskatimes.pl
> Accept: */*
>
< HTTP/1.1 302 Found
< Date: Wed, 17 Apr 2013 17:39:51 GMT
< Server: Apache
< Set-Cookie: sprawdz_cookie=1; expires=Thu, 17-Apr-2014 17:39:51 GMT
< Location: http://www.polskatimes.pl/rss/fakty_kraj.xml?cookie=1
< Vary: Accept-Encoding
< Content-Length: 0
< Connection: close
< Content-Type: text/html; charset=iso-8859-2
<
* Closing connection #0
That shows you the request headers (prefixed with >), the response headers (prefixed with <), and the response body (empty in this case). As you can see, the status is 302 Found, which, like every 3xx status, means a redirect, and the Location header tells you where to:
Location: http://www.polskatimes.pl/rss/fakty_kraj.xml?cookie=1
As the query parameter suggests, this is a cookie-check. The cookie itself is set as well:
Set-Cookie: sprawdz_cookie=1; expires=Thu, 17-Apr-2014 17:39:51 GMT
So in the next step we replay the last command, this time setting the cookie, which can be done with the -b argument:
$ curl -v -b prawdz_cookie=1 http://www.polskatimes.pl/rss/fakty_kraj.xml
* About to connect() to www.polskatimes.pl port 80 (#0)
* Trying 195.8.99.38... connected
* Connected to www.polskatimes.pl (195.8.99.38) port 80 (#0)
> GET /rss/fakty_kraj.xml HTTP/1.1
> User-Agent: curl/7.21.1 (i686-pc-mingw32) libcurl/7.21.1 OpenSSL/0.9.8r zlib/1.2.3
> Host: www.polskatimes.pl
> Accept: */*
> Cookie: prawdz_cookie=1
>
< HTTP/1.1 200 OK
< Date: Wed, 17 Apr 2013 17:43:52 GMT
< Server: Apache
< Set-Cookie: sesja_gratka=e38fa0eb93705c8de7ae906198494439; expires=Wed, 24-Apr-2013 17:43:52 GMT; path=/; domain=polskatimes.pl
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Vary: Accept-Encoding
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: text/xml; charset=utf-8
<
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title><![CDATA[Fakty - Kraj]]></title>
<link>http://www.polskatimes.pl/fakty/kraj/</link>
<atom:link href="http://www.polskatimes.pl/rss/fakty_kraj.xml" rel="self" type="application/rss+xml"/>
<description><![CDATA[Materiały z działu Kraj]]></description>
... (truncated)
So this is immediately successful. And now the really good part: you know that you need to set the cookie for the request, and curl already shows you all the headers it used:
> GET /rss/fakty_kraj.xml HTTP/1.1
> User-Agent: curl/7.21.1 (i686-pc-mingw32) libcurl/7.21.1 OpenSSL/0.9.8r zlib/1.2.3
> Host: www.polskatimes.pl
> Accept: */*
> Cookie: prawdz_cookie=1
Most of them you do not need to care about with file_get_contents: the request line, the Host: line, and the Accept: line.
The User-Agent: header does not seem to play a role, since curl's own is accepted.
So all that is left is the Cookie: header. Let's try it in PHP:
$ php -r "echo file_get_contents('http://www.polskatimes.pl/rss/fakty_kraj.xml', null,
stream_context_create(['http'=>['header'=>['Cookie: prawdz_cookie=1']]]));"
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title><![CDATA[Fakty - Kraj]]></title>
<link>http://www.polskatimes.pl/fakty/kraj/</link>
<atom:link href="http://www.polskatimes.pl/rss/fakty_kraj.xml" rel="self"
type="application/rss+xml"/>
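... (truncated)
And this directly confirms that only the Cookie: prawdz_cookie=1 request header is needed. (The server's Set-Cookie line names the cookie sprawdz_cookie; the requests above send prawdz_cookie and succeed all the same.)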
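The cookie step can also be automated. The sketch below assumes the response header lines of the first request are available as an array of strings (PHP fills $http_response_header after an HTTP file_get_contents() call) and turns every Set-Cookie line into a single Cookie: request header. The function name buildCookieHeader is hypothetical.

```php
<?php
// Hypothetical helper: build a "Cookie:" request header from the
// "Set-Cookie:" lines of a previous response.
function buildCookieHeader(array $responseHeaders): string
{
    $pairs = [];
    foreach ($responseHeaders as $line) {
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue; // not a cookie line
        }
        $value = trim(substr($line, strlen('Set-Cookie:')));
        // Keep only the leading name=value pair; attributes such as
        // expires or path follow the first ';' and are not sent back.
        $pair = trim(explode(';', $value, 2)[0]);
        if ($pair !== '' && strpos($pair, '=') !== false) {
            $pairs[] = $pair;
        }
    }
    return $pairs === [] ? '' : 'Cookie: ' . implode('; ', $pairs);
}
```

The returned string can be placed in the 'header' entry of the http stream context for the second request, in the same way as the hand-written Cookie: header above.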
I'm retrieving data from a URL using curl.
Everything works fine if the PHP code is called via an HTTP request or if the URL is entered in Firefox. If the very same code is executed from a PHP CLI script, curl_exec returns false. The error message is "Failure when receiving data from the peer".
Any ideas why curl is not working?
When I set the curl output to verbose I get:
< HTTP/1.1 200 OK
< Server: Apache-Coyote/1.1
< Last-Modified: Mon, 01 Aug 2011 13:04:59 GMT
< Cache-Control: no-store
< Cache-Control: no-cache
< Cache-Control: must-revalidate
< Cache-Control: pre-check=0
< Cache-Control: post-check=0
< Cache-Control: max-age=0
< Pragma: no-cache
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Content-Type: text/xml
< Transfer-Encoding: chunked
< Date: Mon, 01 Aug 2011 13:04:58 GMT
<
* Trying 153.46.254.70... * Closing connection #0
* Failure when receiving data from the peer
This is the code:
// if curl is not installed we trigger an alert, and exit the function
if (!function_exists('curl_init')){
watchdog('sixtk_api', 'curl is not installed, api call cannot be executed',array(),WATCHDOG_ALERT);
return $this;
}
// OK cool - then let's create a new cURL resource handle
$ch = curl_init();
// Set URL to download
curl_setopt($ch, CURLOPT_URL, $this->request);
// Should cURL return or print out the data? (true = return, false = print)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 180);
// Download the given URL, and return output
$output = curl_exec($ch);
if ($output === false) {
$error = curl_error($ch);
echo($error);
}
// Close the cURL resource, and free system resources
curl_close($ch);
Try wget. If that fails too, but you can access the address from another IP or device, your IP is probably being blocked or filtered out by a firewall or by nginx anti-DDoS protection. Try a proxy.
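When narrowing this down, it helps to separate a failed transfer from an empty response. With CURLOPT_RETURNTRANSFER set, curl_exec() returns false on failure, while an empty body comes back as an empty string, which is also falsy. A minimal sketch; the function name describeCurlResult and the message wording are assumptions:

```php
<?php
// Hypothetical helper: turn a curl_exec() result plus curl_errno()/curl_error()
// into a readable diagnosis.
function describeCurlResult($output, int $errno, string $error): string
{
    if ($output === false) {
        // The transfer itself failed; errno and error say why.
        return sprintf('transfer failed (errno %d): %s', $errno, $error);
    }
    if ($output === '') {
        // The transfer worked but the server sent no body.
        return 'transfer succeeded but the body was empty';
    }
    return sprintf('transfer succeeded, %d bytes received', strlen($output));
}
```

It would be called right after curl_exec(), before curl_close(): describeCurlResult($output, curl_errno($ch), curl_error($ch)).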