I am trying to capture the first instance of particular elements from an object. I have an object $doc and would like to get the values of the following.
id, url, alias, description and label i.e. specifically:
variable1 - Q95,
variable2 - //www.wikidata.org/wiki/Q95,
variable3 - Google Inc.,
variable4 - American multinational Internet and technology corporation,
variable5 - Google
I've made some progress getting the $jsonArr string; however, I'm not sure this is the best approach, and if it is, I'm not sure how to proceed.
Please advise on the best way to get these values. My code is below:
<HTML>
<body>
<form method="post">
Search: <input type="text" name="q" value="Google"/>
<input type="submit" value="Submit">
</form>
<?php
if (isset($_POST['q'])) {
$search = $_POST['q'];
$errors = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTMLFile("https://www.wikidata.org/w/api.php?action=wbsearchentities&search=$search&format=json&language=en");
libxml_clear_errors();
libxml_use_internal_errors($errors);
var_dump($doc);
echo "<p>";
$jsonArr = $doc->documentElement->nodeValue;
$jsonArr = (string)$jsonArr;
echo $jsonArr;
}
?>
</body>
</HTML>
Since the response to your API request is JSON, not HTML or XML, DOMDocument is the wrong tool. Perform the HTTP request with cURL or PHP's stream functions instead; even something as simple as file_get_contents will do.
For example, using cURL:
// Make the request
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.wikidata.org/w/api.php?action=wbsearchentities&search=google&format=json&language=en");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
// Decode the string into an appropriate PHP type
$contents = json_decode($output);
// Navigate the object
$contents->search[0]->id; // "Q95"
$contents->search[0]->url; // "//www.wikidata.org/wiki/Q95"
$contents->search[0]->aliases[0]; // "Google Inc."
You can use var_dump to inspect the $contents and traverse it like you would any PHP object.
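Since the question asks for five fields, here is a minimal sketch that collects all of them from the first search hit. It decodes a trimmed-down sample response instead of calling the API; the sample's shape (a `search` array whose items carry `id`, `url`, `label`, `description` and `aliases`) is an assumption based on the values quoted in the question, and `firstResultFields` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: pull the five fields from the first search hit.
// Returns null when the response cannot be decoded or has no results.
function firstResultFields(string $json): ?array
{
    $contents = json_decode($json);
    if (!isset($contents->search[0])) {
        return null;
    }
    $first = $contents->search[0];

    // The ?? operator guards against fields that are missing from a hit.
    return [
        'id'          => $first->id ?? null,
        'url'         => $first->url ?? null,
        'alias'       => $first->aliases[0] ?? null,
        'description' => $first->description ?? null,
        'label'       => $first->label ?? null,
    ];
}

// Sample response (assumed shape), using the values from the question.
$sample = '{"search":[{"id":"Q95","url":"//www.wikidata.org/wiki/Q95",'
    . '"label":"Google",'
    . '"description":"American multinational Internet and technology corporation",'
    . '"aliases":["Google Inc."]}]}';

$fields = firstResultFields($sample);
print_r($fields);
```

In real use you would pass the string returned by curl_exec or file_get_contents instead of `$sample`.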
I'm trying to pass some data (JSON) to another page by scanning a QR code.
The page where the data is send to, contains a HTML form. I want to use that form as a last chance to correct the data before sending it to the database.
I found here at S.O. a way to pass the data using cURL: (https://stackoverflow.com/a/15643608/2131419)
QR code library:
http://phpqrcode.sourceforge.net
I use the QR code to execute this function:
function passData () {
$url = 'check.php';
$data = array('name' => 'John', 'surname' => 'Doe');
$ch = curl_init( $url );
# Setup request to send json via POST.
$payload = json_encode($data);
curl_setopt( $ch, CURLOPT_POSTFIELDS, $payload );
curl_setopt( $ch, CURLOPT_HTTPHEADER, array('Content-Type:application/json'));
# Return response instead of printing.
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
# Send request.
$result = curl_exec($ch);
curl_exec($ch);
curl_close($ch);
# Print response.
return $result;
}
Create QR code:
QRcode::png(passData(), $tempDir.'007_4.png', QR_ECLEVEL_L, 4);
echo '<img src="'.$tempDir.'007_4.png" />';
Check.php
<?php $data = json_decode(file_get_contents("php://input"), true); ?>
<form method="post" action="handle.php">
<input type="text" name="name" value="<?php echo $data['name'];?>" /><br />
<input type="text" name="surname" value="<?php echo $data['surname'];?>" /><br />
<input type="submit" />
</form>
Problem:
I can pass the data to check.php, but it's returning plain text instead of a useable HTML form.
Hope someone can help!
EDIT
Some clarification:
What I actually want is to scan the QR code, which executes the passData() function. The QR code scanner app then needs to open a browser showing check.php, with the form and the passed data as the values of the input fields.
Now, I get only the response of check.php (plain text).
When I pass a URL instead of the passData() function, like:
QRcode::png("http://www.google.com", $tempDir.'007_4.png', QR_ECLEVEL_L, 4);
The app asks if I want to go to http://www.google.com.
QR codes cannot execute code. The only executable type of data you can put in a QR code is a URL. That is why using google.com as a URL opens a web browser to that URL. The QR code itself does not render anything.
What your code does is fetch check.php at the moment the QR code is generated and store the response as the code's raw data. That data isn't a web page, just a string, which is what you are seeing. You may be able to encode a javascript: URL, similar to a bookmarklet, but whether it runs depends on the QR code reader being used.
Bookmarklet example:
<?php
function passData() {
// javascript code in a heredoc, you may need to url encode it
return <<<JS
javascript:(function() {
//Statements returning a non-undefined type, e.g. assignments
})();
JS;
}
A better way to do it would be to have your QR code generate a URL like: http://your-site.com/check.php?name=John&surname=Doe and host check.php on your machine. You can use the $_GET data to populate your form and then use javascript to automatically post it as Jah mentioned.
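The URL-based approach could be sketched roughly as follows. `buildCheckUrl` and `renderCheckForm` are hypothetical helper names, and the host is the placeholder from the paragraph above; the form markup is copied from the question. This is an illustration under those assumptions, not a drop-in implementation.

```php
<?php
// Hypothetical helper: build the URL that the QR code should contain.
function buildCheckUrl(string $base, array $data): string
{
    // http_build_query() URL-encodes the keys and values.
    return $base . '?' . http_build_query($data);
}

// Hypothetical helper for check.php: render the form from query parameters.
function renderCheckForm(array $query): string
{
    // Escape the values so they cannot break out of the attribute.
    $name    = htmlspecialchars($query['name'] ?? '', ENT_QUOTES);
    $surname = htmlspecialchars($query['surname'] ?? '', ENT_QUOTES);

    return '<form method="post" action="handle.php">'
        . '<input type="text" name="name" value="' . $name . '" /><br />'
        . '<input type="text" name="surname" value="' . $surname . '" /><br />'
        . '<input type="submit" />'
        . '</form>';
}

$url = buildCheckUrl(
    'http://your-site.com/check.php',
    ['name' => 'John', 'surname' => 'Doe']
);
echo $url, "\n";

// In check.php you would pass $_GET; a plain array stands in for it here.
$form = renderCheckForm(['name' => 'John', 'surname' => 'Doe']);
echo $form, "\n";
```

The `$url` string is what you would hand to QRcode::png() in place of passData(), so the scanner app opens check.php in a browser with the fields pre-filled.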
Not the best way but you can do something like this.
Check.php:
<?php
$data = '<form method="post" action="handle.php">
<input type="text" name="name" value="name" /><br />
<input type="text" name="surname" value="surname" /><br />
<input type="submit" />
</form>';
$html = str_replace(PHP_EOL, ' ', $data);
$html = preg_replace('/[\r\n]+/', "\n", $html);
$html = preg_replace('/[ \t]+/', ' ', $html);
$html = str_replace('> <', '><', $html);
?>
<div id="placeholder">
Write HTML here
</div>
<script type="text/javascript">
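function write_html(id, data) {
document.getElementById(id).innerHTML = data;
}
// Inject the flattened form markup that was built in PHP above
write_html('placeholder', <?php echo json_encode($html); ?>);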
</script>
I am trying to learn data scraping from other websites, so I started by creating a small HTML file.
domhtml.php :
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<body>
<div id="mango">
This is the mango div. It has some text and a form too.
<form>
<input type="text" name="first_name" value="Yahoo" />
<input type="text" name="last_name" value="Bingo" />
</form>
<table class="inner">
<tr><td>Happy</td><td>Sky</td></tr>
</table>
</div>
<table id="data" class="outer">
<tr><td>Happy1</td><td>Sky</td></tr>
<tr><td>Happy2</td><td>Sky</td></tr>
<tr><td>Happy3</td><td>Sky</td></tr>
<tr><td>Happy4</td><td>Sky</td></tr>
<tr><td>Happy5</td><td>Sky</td></tr>
</table>
</body>
</html>
extract.php :
<?php
$ch = curl_init("http://192.168.0.198/projects/domhtml.php");
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$cl = curl_exec($ch);
$dom = new DOMDocument();
$dom->loadHTML($cl);
$dom->validate();
$title = $dom->getElementById("mango");
//var_dump($title);exit;
//$title = $dom->saveXML($title);
echo '<pre>';
print_r($title);
?>
But it returns output :
DOMElement Object
(
)
Why is it empty? What else needs to be done? I also tried the solution from "PHP Dom not retrieving element", but it returns the same result.
Edit:
OK, as you all suggested, I have now done this:
$ch = curl_init("http://192.168.0.198/shopclues/domhtml.php");
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$cl = curl_exec($ch);
$dom = new DOMDocument();
$dom->loadHTML($cl);
$dom->validate();
$title = $dom->getElementById("data");
//var_dump($title);exit;
$title = $dom->saveXML($title);
echo '<pre>';
print_r($title);
So now it is printing :
Happy1 Sky
Happy2 Sky
Happy3 Sky
Happy4 Sky
Happy5 Sky
I want to know how many tr tags there are so that I can store the value of each tr in a variable. In other words, how can I loop over them and store the values?
Thanks in advance.
The output that print_r() and var_dump() give for DOM objects has been steadily improving, so an empty dump does not mean the element was not found:
http://codepad.viper-7.com/hw9UKg
Run the code in the snippet above under different versions of PHP and you'll see the difference between 5.3.3 and 5.4.33.
For the second part of your question, there are many ways to do what you want. I will show you one:
$dom = new DOMDocument();
// I used a different URL
$dom->loadHtmlFile("http://192.168.0.198/shopclues/domhtml.php");
$list = $dom->getElementById("data")->childNodes;
print_r($list->length); // outputs 5 for me.
$list is a DOMNodeList, which implements Traversable, so you can loop over it with foreach to get the values. For more information, check:
http://php.net/manual/en/class.domnodelist.php
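To count the rows and keep each one in a variable, a loop along the following lines should work. This is a minimal sketch that parses an inline, cut-down copy of the "data" table from the question rather than fetching the page; note that it collects every `<tr>` in the document, which is fine here only because the sample contains a single table.

```php
<?php
// A cut-down copy of the "data" table from the question.
$html = '<html><body><table id="data" class="outer">'
    . '<tr><td>Happy1</td><td>Sky</td></tr>'
    . '<tr><td>Happy2</td><td>Sky</td></tr>'
    . '<tr><td>Happy3</td><td>Sky</td></tr>'
    . '</table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$rows = [];
// getElementsByTagName() returns only <tr> elements, so any
// whitespace text nodes between the rows are skipped.
foreach ($dom->getElementsByTagName('tr') as $tr) {
    $cells = [];
    foreach ($tr->getElementsByTagName('td') as $td) {
        $cells[] = trim($td->textContent);
    }
    $rows[] = $cells;
}

echo count($rows), "\n"; // number of <tr> elements found
print_r($rows);          // one array of cell values per row
```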
For more complex queries, you may want to look into DOMXPath:
http://php.net/manual/en/class.domxpath.php
It would also be beneficial to read through the methods available to you on DOMDocument and DOMNode:
http://php.net/manual/en/class.domdocument.php
http://php.net/manual/en/class.domnode.php
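As an illustration of the DOMXPath route, the sketch below selects only the rows of the table whose id is "data" and ignores the inner table. The markup is an abbreviated copy of domhtml.php from the question, embedded as a string so the example is self-contained.

```php
<?php
// Abbreviated copy of domhtml.php: an inner table plus the "data" table.
$html = '<html><body>'
    . '<div id="mango"><table class="inner">'
    . '<tr><td>Happy</td><td>Sky</td></tr></table></div>'
    . '<table id="data" class="outer">'
    . '<tr><td>Happy1</td><td>Sky</td></tr>'
    . '<tr><td>Happy2</td><td>Sky</td></tr>'
    . '</table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select only the rows inside the table whose id is "data".
$rowNodes = $xpath->query('//table[@id="data"]//tr');

$firstCells = [];
foreach ($rowNodes as $tr) {
    // Relative query: the first <td> child of the current row.
    $firstCells[] = $xpath->query('td[1]', $tr)->item(0)->textContent;
}

print_r($firstCells);
```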
I know this may be a very basic question, but I'm facing a requirement in PHP. I've only done very simple things with it, and now I really need help.
I have this scenario:
I invoke a Java Rest WS using the following url:
http://192.168.3.41:8021/com.search.ws.module.ModuleSearch/getResults/jsonp?xmlQuery=%3C?xml%20version%3D'1.0'%20encoding%3D'UTF-8'?%3E%3Cquery%20ids%3D%2216535%22%3E%3CmatchWord%3Ehave%3C/matchWord%3E%3CfullText%3E%3C![CDATA[]]%3E%3C/fullText%3E%3CquotedText%3E%3C!...
But for this I had to use a Java util class to replace some special chars in the xml parameter, because the original xml is something like:
<?xml version='1.0' encoding='UTF-8'?><query ids="16914"><matchWord>avoir</matchWord><fullText><![CDATA[]]></fullText><quotedText><![CDATA[]]></quotedText><sensitivity></sensitivity><operator>AND</operator><offsetCooc>0</offsetCooc><cooc></cooc><collection>0</collection><searchOn>all</searchOn><nbResultDisplay>10</nbResultDisplay><nbResultatsParAspect>...
Now I've been asked to create a PHP page where I can enter the XML as input and send it to the REST WS using a submit button. I made an attempt, but it doesn't seem to be working. Here is my code:
<?php
if($_POST['btnSubmit'] == "Submit")
{
$crudXmlQuery = $_POST['inputXml'];
echo $crudXmlQuery;
echo "=================================================";
$xml = str_replace("%", "%25", $crudXmlQuery);
$xml = str_replace("&", "%26", $crudXmlQuery);
$xml = str_replace("=", "%3D", $crudXmlQuery);
echo $xml;
//$ch = curl_init($url);
//curl_setopt ($ch, CURLOPT_POST, 1);
//curl_setopt ($ch, CURLOPT_POSTFIELDS,'inputXml='.$xml);
//$info = curl_exec ($ch);
//curl_close ($ch);
}
?>
<form action="sampleIndex.php" method="post">
Please insert your XML Query
<input type='text' name='inputXml' value='<?=$crudXmlQuery?>'/>
<input type='submit' name='btnSubmit' value='Submit' />
</form>
I commented out the cURL part since it was giving me some problems. I'm not sure how to handle this requirement yet, so if somebody could help me I would really appreciate it. Thanks in advance. Best regards.
curl_setopt ($ch, CURLOPT_POSTFIELDS,'inputXml='.$xml);
This is not going to work as written: you should URL-encode $xml first, using the urlencode function.
As for the other part, I'm not sure what exactly isn't working there :) But I do not see where you read the value of the input field in your code; you need something like:
$crudXmlQuery = $_POST['inputXml'];
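A minimal sketch of the encoding step, assuming the XML is sent as the `xmlQuery` query parameter seen in the URL from the question. `buildQueryUrl` is a hypothetical helper, and the XML string is a shortened stand-in for the real query, which is truncated in the question.

```php
<?php
// Hypothetical helper: build the request URL with the XML safely encoded.
function buildQueryUrl(string $endpoint, string $xml): string
{
    // urlencode() escapes every reserved character, not just %, & and =.
    return $endpoint . '?xmlQuery=' . urlencode($xml);
}

// Endpoint taken from the question.
$endpoint = 'http://192.168.3.41:8021/com.search.ws.module.ModuleSearch/getResults/jsonp';

// Shortened stand-in for the real query; in the form handler this
// would be $_POST['inputXml'].
$xml = '<query ids="16914"><matchWord>avoir</matchWord><operator>AND</operator></query>';

$url = buildQueryUrl($endpoint, $xml);
echo $url, "\n";

// Round trip: decoding the query string gives back the original XML.
parse_str(parse_url($url, PHP_URL_QUERY), $decoded);
var_dump($decoded['xmlQuery'] === $xml);
```

urlencode() writes spaces as `+`; rawurlencode() is the variant that emits `%20`, as in the URL shown in the question.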
I am using the following code for parsing dom document but at the end I get the error
"google.ac" is null or not an object
line 402
char 1
My guess is that line 402 contains a tag and a lot of ";" characters.
How can I fix this?
<?php
//$ch = curl_init("http://images.google.com/images?q=books&tbm=isch/");
// create a new cURL resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://images.google.com/images?q=books&tbm=isch/");
curl_setopt($ch, CURLOPT_HEADER, 0);
// grab URL and pass it to the browser
$data = curl_exec($ch);
curl_close($ch);
$dom = new DOMDocument();
$dom->loadHTML($data);
//#$dom->saveHTMLFile('newfolder/abc.html')
$dom->loadHTML('$data');
// find all ul
$list = $dom->getElementsByTagName('ul');
// get few list items
$rows = $list->item(30)->getElementsByTagName('li');
// get anchors from the table
$links = $list->item(30)->getElementsByTagName('a');
foreach ($links as $link) {
echo "<fieldset>";
$links = $link->getElementsByAttribute('imgurl');
$dom->saveXML($links);
}
?>
There are a few issues with the code:
You should set the cURL option CURLOPT_RETURNTRANSFER in order to capture the output; by default the output is sent straight to the browser. Like this: curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);. Without it, $data in the code above will always be TRUE or FALSE (http://www.php.net/manual/en/function.curl-exec.php).
$dom->loadHTML('$data'); is incorrect and not required: single quotes prevent variable interpolation, so it parses the literal string $data and discards the document loaded on the previous line.
The way the 'li' and 'a' tags are read might not be correct, because $list->item(30) always points to the element at index 30 (the 31st <ul>), which may not exist.
Anyways, coming to the fixes. I'm not sure if you checked the HTML returned by the CURL request but it seems different from what we discussed in the original post. In other words, the HTML returned by CURL does not contain the required <ul> and <li> elements. It instead contains <td> and <a> elements.
Add-on: I'm not sure why the HTML for the same page differs between the browser and PHP, but here is a likely explanation. The page uses JavaScript that renders some of the HTML dynamically on page load. That dynamic HTML is visible in the browser but not from PHP, so I assume the <ul> and <li> tags are generated dynamically. Anyway, that isn't our concern for now.
Therefore, you should modify your code to parse the <a> elements and then read the image URLs. This code snippet might help:
<?php
$ch = curl_init(); // create a new cURL resource
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://images.google.com/images?q=books&tbm=isch/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$data = curl_exec($ch); // fetch the page and return it as a string
curl_close($ch);
$dom = new DOMDocument();
@$dom->loadHTML($data); // @ suppresses warnings about malformed HTML
$listA = $dom->getElementsByTagName('a'); // read all <a> elements
foreach ($listA as $itemA) { // loop through each <a> element
if ($itemA->hasAttribute('href')) { // check if it has an 'href' attribute
$href = $itemA->getAttribute('href'); // read the value of 'href'
if (preg_match('/^\/imgres\?/', $href)) { // check that 'href' should begin with "/imgres?"
$qryString = substr($href, strpos($href, '?') + 1);
parse_str($qryString, $arrHref); // read the query parameters from 'href' URI
echo '<br>' . $arrHref['imgurl'] . '<br>';
}
}
}
I hope above makes sense. But please note that the above parsing might fail if Google modifies their HTML.
How would I use regex to get information from an IP-to-location API?
This is the API
http://ipinfodb.com/ip_query.php?ip=74.125.45.100
I would need to get the Country Name, Region/State, and City.
I tried this:
$ip = $_SERVER["REMOTE_ADDR"];
$contents = @file_get_contents('http://ipinfodb.com/ip_query.php?ip=' . $ip . '');
$pattern = "/<CountryName>(.*)<CountryName>/";
preg_match($pattern, $contents, $regex);
$regex = !empty($regex[1]) ? $regex[1] : "FAIL";
echo $regex;
When I echo $regex I always get FAIL. How can I fix this?
As Aaron has suggested, it's best not to reinvent the wheel, so try parsing it with simplexml_load_string().
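// URL of the API, taken from the question
$url = 'http://ipinfodb.com/ip_query.php?ip=74.125.45.100';
// Init the CURL
$curl = curl_init();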
// Setup the curl settings
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 0);
// grab the XML file
$raw_xml = curl_exec($curl);
curl_close($curl);
// Setup the xml object
$xml = simplexml_load_string( $raw_xml );
You can now access any part of the $xml variable as an object. With that in mind, here is the response from the API you posted:
<Response>
<Ip>74.125.45.100</Ip>
<Status>OK</Status>
<CountryCode>US</CountryCode>
<CountryName>United States</CountryName>
<RegionCode>06</RegionCode>
<RegionName>California</RegionName>
<City>Mountain View</City>
<ZipPostalCode>94043</ZipPostalCode>
<Latitude>37.4192</Latitude>
<Longitude>-122.057</Longitude>
<Timezone>0</Timezone>
<Gmtoffset>0</Gmtoffset>
<Dstoffset>0</Dstoffset>
</Response>
Now, after you have loaded this XML string with simplexml_load_string(), you can access the response's IP address like so (element names are case-sensitive):
$xml->Ip;
simplexml_load_string() will transform well-formed XML into an object that you can manipulate. Beyond that, the best advice is to try it out and play with it.
EDIT:
Source
http://www.php.net/manual/en/function.simplexml-load-string.php
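For the three values the question asks about, a minimal sketch could look like this. It parses an abbreviated copy of the sample response shown above from a string, so no network call is involved; the variable names are illustrative.

```php
<?php
// Abbreviated copy of the sample response shown above.
$rawXml = '<Response>'
    . '<Ip>74.125.45.100</Ip>'
    . '<CountryName>United States</CountryName>'
    . '<RegionName>California</RegionName>'
    . '<City>Mountain View</City>'
    . '</Response>';

$xml = simplexml_load_string($rawXml);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

// Element names are case-sensitive; cast to string to get the text content.
$country = (string) $xml->CountryName;
$region  = (string) $xml->RegionName;
$city    = (string) $xml->City;

echo "$country, $region, $city\n";
```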
You really are better off using an XML parser to pull the information.
For example, this script will parse it into an array.
Regex really shouldn't be used to parse HTML or XML.
If you really need to use regular expressions, then you should correct the one you are using. "|<CountryName>([^<]*)</CountryName>|i" would work better.
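If you do go the regex route, the corrected pattern can be applied as in the sketch below. `extractTag` is a hypothetical helper that generalizes the pattern to any tag name, and the input is an abbreviated sample of the API response rather than a live request.

```php
<?php
// Hypothetical helper: return the text of the first <$tag> element, or null.
function extractTag(string $xml, string $tag): ?string
{
    // Same shape as the corrected pattern: note the slash in the closing tag.
    $name    = preg_quote($tag, '|');
    $pattern = '|<' . $name . '>([^<]*)</' . $name . '>|i';

    return preg_match($pattern, $xml, $m) === 1 ? $m[1] : null;
}

// Abbreviated sample of the API response.
$contents = '<Response><CountryName>United States</CountryName>'
    . '<RegionName>California</RegionName>'
    . '<City>Mountain View</City></Response>';

$country = extractTag($contents, 'CountryName');
$region  = extractTag($contents, 'RegionName');
$city    = extractTag($contents, 'City');
$missing = extractTag($contents, 'ZipPostalCode');

var_dump($country, $region, $city, $missing);
```

The original pattern failed because its closing tag lacked the slash, so it never matched; this version still breaks on nested markup or attributes, which is why a parser remains the safer choice.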