I need to find a way to detect whether a website (a Joseki endpoint) is overloaded. http://128.250.202.125:7001/joseki/oracle is always up, but when I submit a query it sometimes just idles (i.e. it is overloaded rather than down).
My approach so far is to simulate a form submission using curl. If curl_exec returns false, I know the website is overloaded.
The major problem is that I am not sure whether an overloaded website actually triggers a FALSE return.
I can log curl_exec's return value using this method, but that only tells me about the website going down.
<?php
$is_run = true;
if($is_run) {
$url = "http://128.250.202.125:7001/joseki/oracle";
$the_query = "
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX ouext: <http://oracle.com/semtech/jena-adaptor/ext/user-def-function#>
PREFIX oext: <http://oracle.com/semtech/jena-adaptor/ext/function#>
PREFIX ORACLE_SEM_FS_NS: <http://oracle.com/semtech#timeout=100,qid=123>
SELECT ?sc ?c
WHERE
{ ?sc rdfs:subClassOf ?c}
";
// Simulate form submission.
$postdata = http_build_query(array('query' => $the_query));
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $postdata);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$tmp_condi = curl_exec($curl);
// After I submit the simulated form, if http://128.250.202.125:7001/joseki/oracle is
// not responding (i.e. idling), does curl_exec definitely return FALSE?
if($tmp_condi === FALSE) {
die('not responding');
}
else {
}
curl_close($curl);
}
Solution
I was able to solve it by adding the following, based on "Setting Curl's Timeout in PHP":
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($curl, CURLOPT_TIMEOUT, 400); // timeout in seconds
I need to find a way to detect whether a website is responding or not.
My approach so far is to simulate a form submission using curl.
I'd rather do an HTTP HEAD request (see docs) and check the return code. You do not need any data returned, so there is no point in sending a POST request or fetching the response body. I'd also set a shorter timeout for the request:
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_HEADER, true);
// CURLOPT_NOBODY issues a HEAD request and does not wait for a response body
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
$content = curl_exec($ch);
// The status code is only available after curl_exec() has run
$http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
If $http_status is 200 (OK), then the remote end can perhaps be considered live.
Yes, if the website doesn't respond for some time (set in CURLOPT_CONNECTTIMEOUT) it will trigger an error and curl_exec() will return false. In fact, it will return false on any other error as well, so your code will not actually tell you whether the site is down or not.
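Since curl_exec() returns false for every kind of failure, one way to tell a timeout apart from other errors is to look at curl_errno() afterwards. The following is only a sketch: classify_probe() and probe() are hypothetical names, the timeout values are arbitrary, and treating "timed out" as "overloaded" is an assumption that may not hold for every server.

```php
<?php
// cURL error number 28 means "operation timed out"
// (the same value as the CURLE_OPERATION_TIMEDOUT constant).
const PROBE_TIMEOUT_ERRNO = 28;

// Hypothetical helper: maps a cURL error number and an HTTP status to a coarse state.
function classify_probe(int $errno, int $httpCode): string
{
    if ($errno === PROBE_TIMEOUT_ERRNO) {
        return 'timeout';      // no complete answer within the time limit
    }
    if ($errno !== 0) {
        return 'unreachable';  // DNS failure, connection refused, etc.
    }
    return ($httpCode >= 200 && $httpCode < 400) ? 'ok' : 'http-error';
}

// Hypothetical wrapper: runs one request and classifies the outcome.
function probe(string $url, int $connectTimeout = 5, int $totalTimeout = 15): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $totalTimeout);
    curl_exec($ch);
    $errno = curl_errno($ch);
    $code  = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return classify_probe($errno, $code);
}
```

Calling probe($url) against the endpoint would then return 'timeout' when the request runs out of time, which is the closest signal to "idling" that cURL itself can give.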
Related
I'm building an application using a local PHP file that takes POST/GET input and returns JSON results. I'm doing this to decouple front-end and back-end operation, on the idea that the backend could eventually move elsewhere (and it's neat because you can test backend operation using only a browser and URL variables).
To be clear, I have no immediate or even long-term plans to actually separate them: right now they're on the same server, in the same folder even. I just have a single backend.php file pretending to be a remote server so that I can practice decoupling. Victory for this issue means calling cURL and having the backend receive the session, so that the backend can change, update, or add to the session and the front end sees all changes (basically ONE session for front and back).
The problem is that I'm constantly fighting to get the session to work between the two. When I make AJAX requests with JavaScript, the session works fine because it's a page loading on the same server, so session_start() just works. But when I use cURL, the session data is not transferred.
I've been fighting with this for months, so my curl function is pretty messy, but I can't figure out the magic combination that makes this work. None of the SO questions or online guides I've found work consistently in this case:
// Call the backend using the provided URL and series of name/value vars (in an array)
function backhand($data,$method='POST') {
$url = BACKEND_URL;
// Make sure the backend knows which session in the db to connect to
//$data['session_id'] = session_id();
// Backend will send this back in session so we can see the connection still works.
//$data['session_test'] = rand();
$ch = curl_init();
if ('POST' == $method) {
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
}
$get_url = sprintf("%s?%s", $url, http_build_query($data));
$_SESSION['diag']['backend-stuff'][] = $get_url;
if ('GET' == $method) {
curl_setopt($ch, CURLOPT_PUT, 1);
$url = $get_url;
}
// Optional Authentication:
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
# curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
# curl_setopt($ch, CURLOPT_VERBOSE, 1);
# curl_setopt($ch, CURLOPT_USERAGENT,$_SERVER['HTTP_USER_AGENT']);
// Retrieving session ID
// $strCookie = 'PHPSESSID=' . $_COOKIE['PHPSESSID'] . '; path=/';
$cookieFile = "cookies.txt";
if(!file_exists($cookieFile)) {
$fh = fopen($cookieFile, "w");
fwrite($fh, $_SESSION);
fclose($fh);
}
#curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
#curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
#curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // Cookie aware
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile); // Cookie aware
// We pass the sessionid of the browser within the curl request
curl_setopt( $ch, CURLOPT_COOKIEFILE, $strCookie );
# session_write_close();
// Have to pause the session or the backend wipes the front
if (!$result = curl_exec($ch)) {
pre_r(curl_getinfo($ch));
echo 'Cerr: '.curl_error($ch);
}
curl_close($ch);
# session_start();
// "true" makes it return an array
return json_decode($result,true);
}
I call the function like so from the front-end to get results from the backend:
// Get user by email or ID
function get_user($emailorid) {
// If it's not an email, see if they've been cached. If so, return them
if (is_numeric($emailorid) && $_SESSION['users'][$emailorid])
return $_SESSION['users'][$emailorid];
return backhand(['get_user'=>$emailorid]);
}
So if I call "get_user" anywhere in the front, it will hop over to the back, run the db queries and dump it all to JSON, which is returned to me as an associative array of values. This works fine, but session data doesn't persist properly and it's causing problems.
I even tried DB sessions for a while, but that wasn't consistent either. I'm running out of ideas and might have to build some kind of alternate session capability by using the db and custom functions, but I expect this CAN work... I just haven't figured out how yet.
You could keep file system storage and, if your backend web servers are on different machines, share the directory where sessions are stored over NFS.
You could also use http://www.php.net/manual/en/function.session-set-save-handler.php to set a different save handler for your sessions, but I am not sure that storing them in a database would be a good idea in terms of I/O.
After brute-force trial and error, this seems to work:
function backhand($data,$method='POST') {
$url = BACKEND_URL;
// Make sure the backend knows which session in the db to connect to
//$data['session_id'] = session_id();
// Backend will send this back in session so we can see the connection still works.
$data['session_test'] = rand();
$ch = curl_init();
if ('POST' == $method) {
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
}
$get_url = sprintf("%s?%s", $url, http_build_query($data));
$_SESSION['diag']['backend-stuff'][] = $get_url;
if ('GET' == $method) {
curl_setopt($ch, CURLOPT_HTTPGET, 1);
$url = $get_url;
}
// Optional Authentication:
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
// required or it writes the data directly into document instead of putting it in a var below
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
#curl_setopt($ch, CURLOPT_USERAGENT,$_SERVER['HTTP_USER_AGENT']);
// Retrieving session ID
$strCookie = 'PHPSESSID=' . $_COOKIE['PHPSESSID'] . '; path=/';
// We pass the sessionid of the browser within the curl request
curl_setopt( $ch, CURLOPT_COOKIE, $strCookie );
curl_setopt($ch, CURLOPT_COOKIEJAR, 'somefile');
curl_setopt($ch, CURLOPT_COOKIEFILE, APP_ROOT.'/cookie.txt');
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 5000);
// Have to pause the session or the backend wipes the front
session_write_close();
if (!$result = curl_exec($ch)) {
pre_r(curl_getinfo($ch));
echo 'Cerr: '.curl_error($ch);
}
#session_start();
curl_close($ch);
// "true" makes it return an array
return json_decode($result,true);
}
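The two pieces that seem to matter in the version above are forwarding the browser's session ID as a cookie and calling session_write_close() before the request. With PHP's default file-based sessions the session file is locked while a script has it open, so a backend on the same server cannot open the same session until the front end releases it. A stripped-down sketch of just that part follows; session_cookie_header() and call_backend_with_session() are hypothetical names, and reopening the session afterwards assumes output has not been sent yet.

```php
<?php
// Hypothetical helper: builds the Cookie header value that forwards a session ID.
function session_cookie_header(string $sessionName, string $sessionId): string
{
    return $sessionName . '=' . rawurlencode($sessionId);
}

// Hypothetical wrapper: calls the backend so that it joins the caller's session.
function call_backend_with_session(string $url, array $data): ?array
{
    $cookie = session_cookie_header(session_name(), session_id());

    // Release the session lock so the backend can open the same session.
    session_write_close();

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIE, $cookie);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $result = curl_exec($ch);

    // Reopen the session so the front end sees whatever the backend stored.
    session_start();

    $decoded = is_string($result) ? json_decode($result, true) : null;
    return is_array($decoded) ? $decoded : null;
}
```

Using session_name() instead of a hard-coded PHPSESSID keeps the sketch working if the session name has been changed in the configuration.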
I'm trying to do the bare minimum, just to get it working.
Here is my Google Script:
function doPost(e) {
return ContentService.createTextOutput(JSON.stringify(e.parameter));
}
Here is my PHP code:
$url = 'https://script.google.com/a/somedomain.com/macros/s/### script id ###/exec';
$data['name'] = "Joe";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-type: multipart/form-data"));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
// curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($ch);
$error = curl_error($ch);
Executing this, $result is true.
If I uncomment the CURLOPT_RETURNTRANSFER line, $result =
<HTML>
<HEAD>
<TITLE>Bad Request</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<H1>Bad Request</H1>
<H2>Error 400</H2>
</BODY>
</HTML>
$error is always empty.
I would use doGet() but I need to send some rather large POSTs that will exceed what GET can handle.
How can I post to a Google script and return data?
------ UPDATE ------
I've just learned my lead developer tried this some time ago and concluded doPost() errors when returning so apparently it's not just me. My take is that Google is simply not reliable enough to use. I would love for someone to prove me wrong.
------ UPDATE 2 - THE FIX ---------
Apparently this was the problem:
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
needs to be:
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
No idea why http_build_query() caused it to error.
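One plausible explanation, which is an assumption and has not been confirmed against Google's side: http_build_query() produces a URL-encoded string, while the code explicitly sets the Content-Type header to multipart/form-data. The body and the declared type then disagree (and the hand-written header carries no boundary), which a server may reject with a 400. Passing an array lets cURL build a real multipart body. The sketch below only illustrates the difference; content_type_for() is a hypothetical helper.

```php
<?php
$data = ['name' => 'Joe'];

// A string body is sent as-is; http_build_query() yields URL-encoded form data.
$encoded = http_build_query($data); // "name=Joe"

// Hypothetical helper: pick a Content-Type that matches the body being sent.
// Returns null when cURL should be left to set the header itself.
function content_type_for($postFields): ?string
{
    if (is_array($postFields)) {
        // Array: cURL generates multipart/form-data with its own boundary.
        return null;
    }
    // String: the body is URL-encoded form data.
    return 'application/x-www-form-urlencoded';
}
```

Under that reading, either dropping the manual Content-Type header or making it match the body should avoid the mismatch.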
Try reading the documentation for CURLOPT_POSTFIELDS and you'll see that it says: To post a file, prepend a filename with @ and use the full path. That looks like what you are trying to do. Note that in PHP 5.5, the CURLFile class was introduced to let you post files.
If you are using php 5.5 or later, you might try this:
$url = 'https://script.google.com/a/somedomain.com/macros/s/### script id ###/exec';
// create a CURLFile object:
$cfile = new CURLFile('file.pdf','application/pdf'); // you can also optionally use a third parameter
// your POST data...you may need to add other data here like api keys and stuff
$data = array("fileName" => $cfile);
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-type: multipart/form-data"));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
// FROM THE DOCS:
// If value is an array, the Content-Type header will be set to multipart/form-data (so you might skip the line above)
// As of PHP 5.2.0, value must be an array if files are passed to this option with the @ prefix
// As of PHP 5.5.0, the @ prefix is deprecated and files can be sent using CURLFile
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// set this to TRUE if you want curl_exec to retrieve the result
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($ch);
if ($result === FALSE) {
echo "The result is FALSE. There was a problem\n";
$error = curl_error($ch);
var_dump($error);
die();
} else {
echo "success!\n";
var_dump($result);
}
// this can give you more information about your request
$info = curl_getinfo($ch);
if ($info === FALSE) {
echo "curlinfo is FALSE! Something weird happened";
}
var_dump($info); // examine this output for clues
EDIT: If you are not getting any error, and $result comes back with something like "Bad Request", then you will need to inspect the result more closely to find out what the problem is. A well-behaved API should return informative messages to help you fix the problem. If the API doesn't tell you what you did wrong, you can examine the curlinfo you get from these commands:
$info = curl_getinfo($ch);
var_dump($info); // examine this output for clues
If $result and $info don't tell you what you've done wrong, try reading the API documentation more closely. You might find a clue in there somewhere.
If you can't figure out what the problem is using these tactics, there's not much else you can do with your code. You'll need more information from the maintainers of the API.
You need to look at your HTTP Request header to see what is actually being posted.
When troubleshooting I add these options:
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT,10);
curl_setopt($ch, CURLOPT_FAILONERROR,true);
curl_setopt($ch, CURLOPT_ENCODING,"");
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
CURLINFO_HEADER_OUT will add "request_header" to curl_getinfo()
You also want to look at these curl_getinfo() elements.
request_size
size_upload
upload_content_length
request_header
I have a function:
public function getHeaders($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
$x = curl_exec($ch);
curl_close($ch);
return (array) HTTP::parse_header_string($x) ;
}
When $url = 'http://www.google.com', I get the header location: http://www.google.de/?gfe_rd=cr&ei=SOMEHASHGOESHERE
If I load it again I get the same headers, except that 'SOMEHASHGOESHERE' is different.
My task is to develop a web crawler. I know how to do its basic logic, but there are a few nuances. One of them: what should my spider do if the requested URL sends a 'Location' header and tries to redirect? What behavior should the spider follow so that it cannot fall into an infinite redirect loop?
(How can I identify similar URLs such as http://www.google.de/?gfe_rd=cr&ei=SOMEHASHGOESHERE, which are typically used in redirect loops, so that my spider knows to ignore such links?)
If you just want to process the final target of all redirections, you can get curl to follow URLs without returning the redirection pages:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
If you are only interested in the base URL without its query parameters, you can get it easily with explode:
$urlParts = explode("?",$url);
$baseUrl = $urlParts[0];
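For the loop question itself, one common approach is to cap the number of hops and to remember which base URLs have already been seen in the current chain. When cURL follows redirects itself, CURLOPT_MAXREDIRS sets such a cap. The sketch below does the bookkeeping by hand; redirect_key() and follow_redirects() are hypothetical names, $fetchLocation stands in for something like getHeaders() that returns the Location target or null, and dropping the query string is an assumption that will also merge genuinely different pages such as ?page=2.

```php
<?php
// Hypothetical helper: reduces a URL to scheme://host[:port]/path,
// dropping the query string and fragment.
function redirect_key(string $url): string
{
    $p = parse_url($url);
    if ($p === false || !isset($p['host'])) {
        return $url; // not an absolute URL; compare it verbatim
    }
    $scheme = strtolower($p['scheme'] ?? 'http');
    $host   = strtolower($p['host']);
    $port   = isset($p['port']) ? ':' . $p['port'] : '';
    $path   = $p['path'] ?? '/';
    return $scheme . '://' . $host . $port . $path;
}

// Follows a redirect chain. $fetchLocation is a caller-supplied callable that
// returns the Location target for a URL, or null when there is no redirect.
// Returns the final URL, or null on a loop or when $maxHops is exceeded.
function follow_redirects(string $url, callable $fetchLocation, int $maxHops = 10): ?string
{
    $seen = [];
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $key = redirect_key($url);
        if (isset($seen[$key])) {
            return null; // same base URL again: treat as a loop
        }
        $seen[$key] = true;

        $next = $fetchLocation($url);
        if ($next === null) {
            return $url; // no further redirect
        }
        $url = $next;
    }
    return null; // too many hops
}
```

With this, the two google.de URLs from the question map to the same key even though their ei values differ.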
I'm trying to figure out why this won't work for me. I'm a complete noob when it comes to cURL, today is my first day using it. I followed a tutorial for this but obviously failed.
It should check the page and if it sees "Skill Stats" on there, then return "Success", and return "Failure" if it spots "Member Rankings".
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://services.runescape.com/m=hiscore/compare.ws?user1=Mercon185");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_HTTPGET, TRUE);
curl_setopt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
if (stristr($output,"Skill Stats")) {
echo 'Success';
}
if (stristr($output,"Member Rankings")) {
echo 'Failure';
}
curl_close($ch);
?>
You need to enable follow redirects. As I see currently, your URL redirects to http://services.runescape.com/m=hiscore/overall.ws?errorcode=1. Without follow redirects enabled, it only fetches the first page, which indeed is empty.
The final landing page though, contains the data you want, so if you add this line to your cURL options, it should work:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
I am trying to update some custom fields using the REST API and PHP/cURL.
I'm wondering if I might have edited something without realizing it: what I have below "worked" yesterday (I think), but it does not work now.
I get varying responses using the different "methods", from:
I get this one using the POST method, as it is uncommented below.
HTTP 405 - The specified HTTP method is not allowed for the requested
resource ().
I get this one if I use the commented-out PUT method, with POST commented out.
{"status-code":500,"message":"Read timed out"}
And this one mixing and matching PUT and POST.
{"errorMessages":["No content to map to Object due to end of input"]}
What am I missing/doing wrong? I am using the following code:
<?php
$username = 'username';
$password = 'password';
$url = "https://example.com/rest/api/2/issue/PROJ-827";
$ch = curl_init();
$headers = array(
'Accept: application/json',
'Content-Type: application/json'
);
$test = "This is the content of the custom field.";
$data = <<<JSON
{
"fields": {
"customfield_11334" : ["$test"]
}
}
JSON;
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// Also tried, with the above two lines commented out...
// curl_setopt($ch, CURLOPT_PUT, 1);
// curl_setopt($ch, CURLOPT_INFILE, $data);
// curl_setopt($ch, CURLOPT_INFILESIZE, strlen($data));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERPWD, "$username:$password");
$result = curl_exec($ch);
$ch_error = curl_error($ch);
if ($ch_error) {
echo "cURL Error: $ch_error";
} else {
echo $result;
}
curl_close($ch);
?>
The problem here is that PHP's cURL API is not particularly intuitive.
You might think that because a POST request body is sent using the following option
that a PUT request would be done the same way:
// works for sending a POST request
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// DOES NOT work to send a PUT request
curl_setopt($ch, CURLOPT_PUT, 1);
curl_setopt($ch, CURLOPT_PUTFIELDS, $data);
Instead, to send a PUT request (with associated body data), you need the following:
// The correct way to send a PUT request
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "PUT");
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
Note that even though you're sending a PUT request, you still have to use the CURLOPT_POSTFIELDS
option to send your PUT request body. It's a confusing and inconsistent process, but it's what you've
got if you want to use the PHP cURL bindings.
According to the relevant manual entry, the CURLOPT_PUT option seems to only work for PUTting a file directly:
TRUE to HTTP PUT a file. The file to PUT must be set with CURLOPT_INFILE and CURLOPT_INFILESIZE.
A better option IMHO is to use a custom stream wrapper for HTTP client operations. This carries the
added benefit of not making your application reliant on the underlying libcurl library. Such an
implementation is beyond the scope of this question, though. Google is your friend if you're interested
in developing a stream wrapper solution.
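As a rough illustration of the non-cURL route, PHP's built-in http stream wrapper (not a custom wrapper) can send a PUT when it is given a stream context. This is a minimal sketch under assumptions: json_put_context_options() is a hypothetical helper, Basic authentication and the timeout value are guesses at what the endpoint needs, and the payload shape is only an example.

```php
<?php
// Hypothetical helper: builds stream-context options for a JSON PUT request.
function json_put_context_options(array $payload, string $user, string $pass): array
{
    return [
        'http' => [
            'method'  => 'PUT',
            'header'  => implode("\r\n", [
                'Content-Type: application/json',
                'Accept: application/json',
                'Authorization: Basic ' . base64_encode("$user:$pass"),
            ]),
            'content'       => json_encode($payload),
            'ignore_errors' => true, // still return the body on 4xx/5xx responses
            'timeout'       => 30,   // seconds
        ],
    ];
}

// Usage (placeholders; this performs a real HTTP request):
// $options  = json_put_context_options(['fields' => ['customfield_11334' => 'text']], 'username', 'password');
// $context  = stream_context_create($options);
// $response = file_get_contents($url, false, $context);
```

Building the body with json_encode() instead of a heredoc also avoids invalid JSON when the field content contains quotes.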