I have been trying to use MongoLab's API to simplify my life, and for the most part it was working until I tried to push updates to the database using PHP and cURL. No dice. My code is similar to this:
$data_string = json_encode('{"user.userEmail": "USER_EMAIL", "user.pass":"USER_PASS"}');
try {
$ch = curl_init();
//need to create temp file to pass to curl to use PUT
$tempFile = fopen('php://temp/maxmemory:256000', 'w');
if (!$tempFile) {
die('could not open temp memory data');
}
fwrite($tempFile, $data_string);
fseek($tempFile, 0);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "PUT");
//curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
//curl_setopt($ch, CURLOPT_INFILE, $tempFile); // file pointer
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, DB_API_REQUEST_TIMEOUT);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/json',
'Content-Length: ' . strlen($data_string),
)
);
$cache = curl_exec($ch);
curl_close($ch);
} catch (Exception $e) {
return FALSE;
}
My problem seems to be with MongoLab's API. The code works perfectly, except that MongoLab tells me the data I am passing is an 'Invalid object{ "user.firstName" :"Pablo","user.newsletter":"true"}: fields stored in the db can't have . in them.'. I have tried both passing a file and using CURLOPT_POSTFIELDS, but neither worked.
When I test it with Firefox's Poster plugin, the values work fine. If someone out there has a better understanding of MongoLab, I would love some enlightenment. Thanks in advance!
You will need to remove the dots from your field names. You might try going to a schema like this:
{ "user": { "userEmail": "USER_EMAIL", "pass": "USER_PASS" } }
Unfortunately, MongoDB doesn't support using dots in field names. This is because its query language uses the dot as an operator to chain nested field names. If MongoDB were to allow dots in field names, dotted queries would become ambiguous without some kind of escaping mechanism.
If this document were legal:
{ "bow.ties": "uncool", "bow": { "ties": "cool" } }
This query would be ambiguous:
{ "bow.ties": "cool" }
It would not be clear whether the document should match. Did you mean the field "bow.ties" or the field "ties" nested within the value of field "bow"?
Here's a capture of a mongo shell session demonstrating these ideas.
% mongo
MongoDB shell version: 2.1.1
connecting to: test
> db.stuff.save({"bow.ties":"uncool"})
Wed Jul 18 11:17:59 uncaught exception: can't have . in field names [bow.ties]
> db.stuff.save({"bow":{"ties":"cool"}})
> db.stuff.find({"bow.ties":"cool"})
{ "_id" : ObjectId("5006ff3f1348197bacb458f7"), "bow" : { "ties" : "cool" } }
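If you already have data keyed with dotted names, one way to get to the nested schema suggested above is to convert it before encoding. This is only a rough sketch: nest_dotted_keys is a hypothetical helper name, not part of PHP or of MongoLab's API, and it assumes that every dot in a key is meant as a nesting separator.

```php
<?php
// Hypothetical helper: turns ['user.pass' => 'x'] into ['user' => ['pass' => 'x']].
// Assumes every dot in a key is a nesting separator.
function nest_dotted_keys(array $flat): array
{
    $nested = [];
    foreach ($flat as $key => $value) {
        $ref = &$nested;
        foreach (explode('.', (string) $key) as $part) {
            if (!isset($ref[$part]) || !is_array($ref[$part])) {
                $ref[$part] = [];
            }
            $ref = &$ref[$part];
        }
        $ref = $value;
        unset($ref); // break the reference before handling the next key
    }
    return $nested;
}

$json = json_encode(nest_dotted_keys([
    'user.userEmail' => 'USER_EMAIL',
    'user.pass'      => 'USER_PASS',
]));
echo $json, "\n";
```

The resulting string has the same shape as the nested document suggested in the answer, so it contains no dotted field names.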
After some time working on other functionality of the project, I realized my mistake, which was ultimately the source of the confusion.
The curl PUT was intended to send modifier operations to MongoDB. I was sending all my data as JSON, intercepting it, decoding it for use in PHP, and then re-encoding part of it to send on. So the original data received looks something like this:
{"userEmail":"p@g.com","pass":"****", "$oid":"5555", "$set":{"user.firstName":"Pablo","user.newsletter":"true"}}
The problem was that I was grabbing the value of the "$set" object (in PHP) and re-encoding only that value, {"user.firstName":"Pablo","user.newsletter":"true"}, without the "$set" operator, and sending that produced the error. In this case the proper string to send would have been {"$set":{"user.firstName":"Pablo","user.newsletter":"true"}}
While this is a simple mistake, I hope that the next time someone does something like this and gets an invalid object error, they are lucky enough to find this.
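For anyone landing here, the fix described above could look roughly like this in PHP. It is a minimal sketch: build_set_payload is a hypothetical helper name, and it only builds the request body, it does not send anything.

```php
<?php
// Hypothetical helper: wraps the changed fields in a "$set" modifier document.
function build_set_payload(array $changes): string
{
    // Single quotes keep PHP from treating "$set" as a variable.
    return json_encode(['$set' => $changes]);
}

$payload = build_set_payload([
    'user.firstName'  => 'Pablo',
    'user.newsletter' => 'true',
]);
echo $payload, "\n";
```

The resulting string is what would go into CURLOPT_POSTFIELDS for the PUT request.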
I am working with an API that is documented here: https://cutt.ly/BygHsPV
The documentation is a bit thin, but I am trying to understand it the best I can. There will not be a developer from the creator of the API available before the middle of next week, and I was hoping to get stuff done before that.
Basically what I am trying to do is update the consent of the customer. As far as I can understand from the documentation under API -> Customer I need to send info through PUT to /customers/{customerId}. That object has an array called "communicationChoices".
Going into Objects -> CustomerUpdate I find "communicationChoices" which is specified as "Type: list of CommunicationChoiceRequest". That object looks like this:
{
"choice": true,
"typeCode": ""
}
Doing my best to understand this, I have made this function:
function update_customer_consent() {
global $userPhone, $username, $password;
// Use phone number to get correct user
$url = 'https://apiurlredacted.com/api/v1/customers/' . $userPhone .'?customeridtype=MOBILE';
// Initiate cURL.
$ch = curl_init( $url );
// Specify the username and password using the CURLOPT_USERPWD option.
curl_setopt( $ch, CURLOPT_USERPWD, $username . ":" . $password );
// Tell cURL to return the output as a string instead
// of dumping it to the browser.
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
// Data to send
$data = [
"communicationChoices" => [
"communicationChoiceRequest" => [
"choice" => true,
"typeCode" => "SMS"
]
]
];
$json_payload = json_encode($data);
print_r($json_payload);
// Set other options
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json','Content-Length: ' . strlen($json_payload)));
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "PUT");
curl_setopt($ch, CURLOPT_POSTFIELDS, $json_payload);
// Execute the cURL request
$response = curl_exec($ch);
// Check for errors.
if( curl_errno( $ch ) ) :
// If an error occurred, throw an Exception.
throw new Exception( curl_error( $ch ) );
endif;
if (!$response)
{
return false;
} else {
// Decode JSON
$obj = json_decode( $response );
}
print_r($response);
}
I understand that this is very hard to debug without knowing what is going on within the API and with limited documentation, but I figured asking here was worth a shot anyway.
Basically, $json_payload seems to be a perfectly fine JSON object. The response from the API however, is an error code that means unknown error. So I must be doing something wrong. Maybe someone has more experience with APIs and such documentation and can see what I should really be sending and how.
Any help or guidance will be highly appreciated!
Before you test your code, you can use the form provided in the API documentation.
When you navigate to API > Customers > /customers/{customerId} (GET), you will see a form on the right side of the page (scroll up). Provide the required values in the form, then hit the Submit button. The Response Text section below the Submit button should then show valid data for communicationChoices.
Now follow the data structure of the communicationChoices object from that result and try the same on the API > Customers > /customers/{customerId} (PUT) form.
Using the API forms, you can see a success or error for your input (data structure) right away, then translate it to your code.
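One thing worth checking while comparing structures: the documentation quoted in the question describes communicationChoices as a "list of CommunicationChoiceRequest", which suggests a JSON array of objects rather than an object keyed by "communicationChoiceRequest". The sketch below builds such a payload. It is an assumption based only on that excerpt, not confirmed against the API, and build_consent_payload is a hypothetical helper name.

```php
<?php
// Hypothetical helper; the field names come from the CustomerUpdate excerpt in the question.
function build_consent_payload(string $typeCode, bool $choice): string
{
    $data = [
        'communicationChoices' => [
            // A sequential PHP array is encoded as a JSON array (a list).
            ['choice' => $choice, 'typeCode' => $typeCode],
        ],
    ];
    return json_encode($data);
}

$json_payload = build_consent_payload('SMS', true);
echo $json_payload, "\n";
```

If the GET form shows a different shape for communicationChoices, follow that instead.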
I am new to JSON data transfer. I want a user to click a link on a webpage that redirects him to another page, with his login credentials in the URL, and displays them there. I want to send and receive all of this through JSON. I am working in a PHP environment. I am adding the short code I am working on, but I don't know exactly how to proceed.
send.php
<?php
$data = '{ "user" : [
{ "email" : "xyz@gmail.com",
"password" : "xyz#123",
"employee_id" : 77
}
]
} ';
$url_send ="http://localhost/cwmsbi/recieve.php";
$str_data = json_encode($data);
function sendPostData($url_send, $post){
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS,$post);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/json',
'Content-Length: ' . strlen($post))
);
$result = curl_exec($ch);
curl_close($ch); // Seems like good practice
return $result;
}
echo " " . sendPostData($url_send, $str_data);
?>
And receive.php
<?php
$json_input_data=json_decode(file_get_contents('php://input'),TRUE);
print_r( $json_input_data);
?>
Now when I run send.php on my localhost, it displays the data on the same page but does not go to recieve.php.
How can this be achieved? I am curious and in need of this too. How can I run a JSON file, and where should I obtain the results? Your guidance will be immensely useful to me right now.
First of all, I see you are JSON-encoding $data twice: it is already a JSON string when it is defined, and then you do $str_data = json_encode($data);.
If you want the location to change along with the POST data, you can't use curl
(POST data and redirect user by PHP CURL - read this question for further info), and I don't think you can do it with PHP alone.
If I were trying to achieve what you're trying to achieve (and I would never make a page that shows a login password to users, as it is bad practice to show a password, even in emails), I would store the JSON string in a $_SESSION variable in send.php and redirect with header("Location: http://localhost/cwmsbi/recieve.php"), then read the JSON data from the $_SESSION variable there and print it.
I did not make an example, as I think this one suits you perfectly:
https://stackoverflow.com/a/42215249/9606459
Extra hint: even if placing the password in a PHP $_SESSION variable is better than putting it in a POST request, remember that this is still bad practice, and at least remember to clear that JSON string from the $_SESSION variable after you print it.
e.g.:
unset($_SESSION['user_data']);
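To make the store, redirect, read, clear flow a bit more concrete, here is a rough sketch. stash_user_data and take_user_data are hypothetical helper names, and $store stands in for $_SESSION so the logic can be run outside a web request. The password is left out of the example on purpose.

```php
<?php
// Hypothetical helpers; pass $_SESSION as $store in real pages.
function stash_user_data(array &$store, array $user): void
{
    $store['user_data'] = json_encode($user);
}

function take_user_data(array &$store): ?array
{
    if (!isset($store['user_data'])) {
        return null;
    }
    $user = json_decode($store['user_data'], true);
    unset($store['user_data']); // read once, then clear
    return $user;
}

// In send.php:    session_start(); stash_user_data($_SESSION, $user);
//                 header('Location: ...'); exit;
// In recieve.php: session_start(); $user = take_user_data($_SESSION);
$store = [];
stash_user_data($store, ['email' => 'xyz@example.com', 'employee_id' => 77]);
$user = take_user_data($store);
echo $user['employee_id'], "\n";
```

A second call to take_user_data returns null, because the first call already cleared the stored string.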
I've programmed a very basic web-scraping tool in PHP using cURL and DOM. I'm running it locally on a Windows 10 box using XAMPP (Apache & MySQL). It scrapes approximately 5 values on 400 pages (~2,000 values in total) on one specific website. The job typically completes in < 120 seconds, but intermittently (about once every 5 runs) it'll stop around the 60 second mark with the following error:
Recv failure: Connection was reset
Probably irrelevant, but all of my scraped data is being thrown into a MySQL table, and a separate .php file is styling the data and presenting it. This part is working fine. The error is being thrown by cURL. Here's my (very trimmed) code:
$html = file_get_html('http://IPAddressOfSiteIAmScraping/subpage/listofitems.html');
//Some code that creates my SQL table.
//Finds all subpages on the site - this part works like a charm.
foreach($html->find('a[href^=/subpage/]') as $uniqueItems){
//3 array variables defined here, which I didn't include in this example.
$path = $uniqueItems->href;
$url = 'http://IPAddressOfSiteIAmScraping' . $path;
//Here's the cURL part - I suspect this is the problem. I am an amateur!
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_URL, trim($url));
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); //An attempt to fix it - didn't work.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); //An attempt to fix it - didn't work.
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($curl, CURLOPT_TIMEOUT, 1200); //Amount of time I let cURL execute for.
$page = curl_exec($curl);
//This is the part that throws up the connection reset error.
if(curl_errno($curl)) {
echo 'Scraping error: ' . curl_error($curl);
exit; }
curl_close($curl);
//Here we use DOM to begin collecting specific cURLed values we want in our SQL table.
$dom = new DOMDocument;
$dom->encoding = 'utf-8'; //Allows the DOM to display html entities for special characters like ö.
@$dom->loadHTML(utf8_decode($page)); //Loads the HTML of the cURLed page.
$xpath = new DOMXpath($dom); //Allows us to use Xpath values.
//Xpaths that I've set - this is for the SQL part. Probably irrelevant.
$header = $xpath->query('(//div[@id="wrapper"]//p)[@class="header"][1]');
$price = $xpath->query('//tr[@class="price_tr"]/td[2]');
$currency = $xpath->query('//tr[@class="price_tr"]/td[3]');
$league = $xpath->query('//td[@class="left-column"]/p[1]');
//Here we collect specifically the item name from the DOM.
foreach($header as $e) {
$temp = new DOMDocument();
$temp->appendChild($temp->importNode($e,TRUE));
$val = $temp->saveHTML();
$val = strip_tags($val); //Removes the <p> tag from the data that goes into SQL.
$val = mb_convert_encoding($val, 'html-entities', 'utf-8'); //Allows the HTML entity for special characters to be handled.
$val = html_entity_decode($val); //Converts HTML entities for special characters to the actual character value.
$final = mysqli_real_escape_string($conn, trim($val)); //Defense against SQL injection attacks by canceling out single apostrophes in item names.
$item['title'] = $final; //Here's the item name, ready for the SQL table.
}
//Here's a bunch of code where I write to my SQL table. Again, this part works great!
}
I am not opposed to switching to regex if I need to ditch DOM, but I did three days worth of lurking before I chose DOM over regex. I have spent a lot of time researching this problem, but everything I'm seeing says "Recv failure: Connection was reset by peer", which is not what I am getting. I'm really frustrated that I have to ask for help - I've been doing so great so far - just learning as I go. This is the first thing I've ever written in PHP.
TL;DR: I wrote a cURL web-scraper that works brilliantly only 80% of the time. 20% of the time, for an unknown reason, it errors out with "Recv failure: Connection was reset".
Hopefully someone can help me!! :) Thanks for reading even if you can't!
P.S. if you'd like to see my FULL code, it's at: http://pastebin.com/vf4s0d5L.
After researching this at length (I'd already been researching it for days before posting my question), I've caved in and accepted that this error is probably tied to the site I'm trying to scrape and therefore out of my control.
I did manage to work around it though, so I'll drop my workaround here...
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_URL, trim($url));
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($curl, CURLOPT_TIMEOUT, 1200); //Amount of time I let cURL execute for.
$page = curl_exec($curl);
if(curl_errno($curl)) {
echo 'Scraping error: ' . curl_error($curl) . '<br>';
curl_close($curl); //Free the failed handle before restarting.
echo 'Dropping table...<br>';
$sql = "DROP TABLE table_item_info";
if (!mysqli_query($conn, $sql)) {
echo "Could not drop table: " . mysqli_error($conn);
} else {
echo "TABLE has been dropped. Restarting.<br>";
}
mysqli_close($conn);
goto start; }
curl_close($curl);
Basically, what I've done is implemented error-checking. If the error comes up under curl_errno($curl), I assume it's the connection reset error. That being the case, I drop my SQL table and then jump back to the start of my script using "goto start". Then, at the top of my file I have "start:"
This fixed my problem! Now I don't need to worry about whether the connection was reset or not. My code is smart enough to determine that on its own and reset the script if that was the case.
Hope this helps!
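A gentler variant of the same idea, for anyone who would rather not drop the table and restart the whole run, is to retry only the request that failed, a limited number of times. This is not what I ended up using, just a sketch of the alternative. fetch_with_retry is a hypothetical helper name, and $fetch is any callable that returns the page body or false on failure (for example a closure wrapping curl_exec).

```php
<?php
// Hypothetical helper: retries $fetch until it returns something other than false.
function fetch_with_retry(callable $fetch, int $maxAttempts = 3, int $delayMs = 0)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch();
        if ($result !== false) {
            return $result;
        }
        if ($attempt < $maxAttempts) {
            usleep($delayMs * 1000); // pause before the next attempt
        }
    }
    return false; // every attempt failed
}

// Stand-in for curl_exec() that fails twice, then succeeds.
$calls = 0;
$fake = function () use (&$calls) {
    $calls++;
    return $calls < 3 ? false : '<html>ok</html>';
};
$page = fetch_with_retry($fake);
echo $page, "\n";
```

With real requests a non-zero $delayMs is probably sensible, so the remote site gets a moment to recover between attempts.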
I'm programming in PHP.
An article I've found useful so far was mainly about how to cURL one site with a lot of information, but what I really need is how to cURL multiple sites with not much information each: a few lines, as a matter of fact!
Also, the article mainly focuses on storing the result on an FTP server in a txt file, but I have loaded around 900 addresses into MySQL and want to load them from there, and enrich the table with the information behind the links, which I have provided below.
We have some open public libraries with addresses, information about them, and an API.
Link to the main site:
The function I would like to use: http://dawa.aws.dk/adresser/autocomplete?q=
SQL Structure:
Data example: http://i.imgur.com/jP1J26U.jpg
For example, this address: Dornen 2 6715 Esbjerg N (called AdrName in the database).
http://dawa.aws.dk/adresser/autocomplete?q=Dornen%202%206715%20Esbjerg%20N
This will give me the following output (which I want to store in the AdrID in the database):
[
{
"tekst": "Dornen 2, Tarp, 6715 Esbjerg N",
"adresse": {
"id": "0a3f50b8-d085-32b8-e044-0003ba298018",
"href": "http://dawa.aws.dk/adresser/0a3f50b8-d085-32b8-e044-0003ba298018",
"vejnavn": "Dornen",
"husnr": "2",
"etage": null,
"dør": null,
"supplerendebynavn": "Tarp",
"postnr": "6715",
"postnrnavn": "Esbjerg N"
}
}
]
How to store it all in a blob, as seen in the SQL structure?
If you want to make a cURL request in PHP, use this function:
function curl_download($Url){
// is cURL installed yet?
if (!function_exists('curl_init')){
die('Sorry cURL is not installed!');
}
// OK cool - then let's create a new cURL resource handle
$ch = curl_init();
// Now set some options (most are optional)
// Set URL to download
curl_setopt($ch, CURLOPT_URL, $Url);
// Set a referer
curl_setopt($ch, CURLOPT_REFERER, "http://www.example.org/yay.htm");
// User agent
curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
// Include header in result? (0 = no, 1 = yes)
curl_setopt($ch, CURLOPT_HEADER, 0);
// Should cURL return or print out the data? (true = return, false = print)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
// Download the given URL, and return output
$output = curl_exec($ch);
// Close the cURL resource, and free system resources
curl_close($ch);
return $output;
}
And then you call it using
print curl_download('http://dawa.aws.dk/adresser/autocomplete?q=Melvej');
Or you can decode the JSON directly into an object:
$jsonString=curl_download('http://dawa.aws.dk/adresser/autocomplete?q=Melvej');
var_dump(json_decode($jsonString));
The data you download is JSON, so you can store it in a varchar column rather than a blob.
Also, the site with the API does not seem to care about the HTTP referrer, user agent, etc., so you can use file_get_contents in place of curl.
So simply get all the rows from your db, iterate over them, call the API for each, and update the appropriate row with the returned data:
//get all the rows from your database
$addresses = DB::exec('SELECT * FROM addresses'); //I don't know how you actually access your db; DB::exec is just an example
foreach($addresses as $address){
$searchTerm = $address['AdrName'];
$addressId = $address['Vid'];
//download the json
$apidata = file_get_contents('http://dawa.aws.dk/adresser/autocomplete?q=' . urlencode($searchTerm));
//save back to db
DB::exec('UPDATE addresses SET AdrID=? WHERE Vid=?', [$apidata, $addressId]); //match the row by its id, not by the search term
//if you want to access the data, you can use json_decode:
$data = json_decode($apidata);
echo $data[0]->tekst; //outputs Dornen 2, Tarp, 6715 Esbjerg N
}
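If you would rather store only the address id than the whole JSON response, something like the following could pull it out. It is a sketch based on the sample response in the question. first_address_id is a hypothetical helper name, and it assumes the first match in the list is the one you want.

```php
<?php
// Hypothetical helper: returns the id of the first match, or null if there is none.
function first_address_id(string $json): ?string
{
    $results = json_decode($json, true);
    if (!is_array($results) || !isset($results[0]['adresse']['id'])) {
        return null;
    }
    return $results[0]['adresse']['id'];
}

// Trimmed copy of the sample response shown in the question.
$sample = '[{"tekst":"Dornen 2, Tarp, 6715 Esbjerg N",'
        . '"adresse":{"id":"0a3f50b8-d085-32b8-e044-0003ba298018","postnr":"6715"}}]';
$id = first_address_id($sample);
echo $id, "\n";
```

Inside the loop above you would pass $apidata to it and write the result to the row instead of the raw JSON.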
I'm trying to find a way to quickly access a file and then disconnect immediately.
So I've decided to use cURL since it's the fastest option for me. But I can't figure out how I should "disconnect" cURL.
With the code below, Apache's access logs say that the file I tried accessing was indeed accessed, but I'm feeling a little iffy about this, because when I just run the while loop without breaking out of it, it just keeps looping. Shouldn't the loop stop when cURL has finished fetching the file? Or am I just being silly; is the loop just restarting constantly?
<?php
$Resource = curl_init();
curl_setopt($Resource, CURLOPT_URL, '...');
curl_setopt($Resource, CURLOPT_HEADER, 0);
curl_setopt($Resource, CURLOPT_USERAGENT, '...');
while(curl_exec($Resource)){
break;
}
curl_close($Resource);
?>
I tried setting the CURLOPT_CONNECTTIMEOUT_MS / CURLOPT_CONNECTTIMEOUT options to very small values, but it didn't help in this case.
Is there a more "proper" way of doing this?
This statement is superfluous:
while(curl_exec($Resource)){
break;
}
Instead just keep the return value for future reference:
$result = curl_exec($Resource);
The while loop does not help anything. So now to your question: You can tell curl that it should only take some bytes from the body and then quit. That can be achieved by reducing the CURLOPT_BUFFERSIZE to a small value and by using a callback function to tell curl it should stop:
$withCallback = array(
CURLOPT_BUFFERSIZE => 20, # ~ value of bytes you'd like to get
CURLOPT_WRITEFUNCTION => function($handle, $data) {
echo "WRITE: (", strlen($data), ") $data\n";
return 0;
},
);
$handle = curl_init("http://stackoverflow.com/");
curl_setopt_array($handle, $withCallback);
curl_exec($handle);
curl_close($handle);
Output:
WRITE: (10) <!DOCTYPE
Another alternative is to make a HEAD request by using CURLOPT_NOBODY which will never fetch the body. But it's not a GET request.
The connect timeout settings control how long curl waits before the connection attempt times out. The connect phase lasts until the server accepts input from curl and curl knows that it has. It's not related to the phase when curl fetches data from the server; that is covered by
CURLOPT_TIMEOUT The maximum number of seconds to allow cURL functions to execute.
You can find a long list of available options in the PHP Manual: curl_setopt.
Perhaps that might be helpful?
$GLOBALS["dataread"] = 0;
define("MAX_DATA", 3000); // how many bytes should be read?
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch, CURLOPT_WRITEFUNCTION, "handlewrite");
curl_exec($ch);
curl_close($ch);
function handlewrite($ch, $data)
{
$GLOBALS["dataread"] += strlen($data);
echo "READ " . strlen($data) . " bytes\n";
if ($GLOBALS["dataread"] > MAX_DATA) {
return 0;
}
return strlen($data);
}
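The same idea can be written without globals by building the callback in a closure. This is just a sketch: make_limited_writer is a hypothetical helper name, and the example calls the callback by hand instead of going through cURL, so it can be run without a network connection.

```php
<?php
// Hypothetical factory: builds a CURLOPT_WRITEFUNCTION callback that stops after $maxBytes.
function make_limited_writer(int $maxBytes, string &$buffer): callable
{
    return function ($ch, string $data) use ($maxBytes, &$buffer): int {
        $buffer .= $data;
        // Returning anything other than strlen($data) makes cURL abort the transfer.
        return strlen($buffer) > $maxBytes ? 0 : strlen($data);
    };
}

// Simulate two chunks arriving; cURL would normally make these calls itself.
$received = '';
$writer = make_limited_writer(10, $received);
$first  = $writer(null, 'hello');
$second = $writer(null, 'world!');
echo "$first $second $received\n";
```

With a real handle you would pass it via curl_setopt($ch, CURLOPT_WRITEFUNCTION, $writer) before calling curl_exec($ch).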