I am trying to read the value of some tags, but it's showing an error.
Below is the code; please suggest a solution.
This is the method I used for the HTTP GET request.
function httpGet($result15)
{
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$result15);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$output=curl_exec($ch);
curl_close($ch);
return $output;
}
$result15= httpGet("https://www.googleapis.com/customsearch/v1?key=API_KEY&cx=003255er&q=cancer&num=1&alt=atom");//new cse
echo $result15;
$xml = new DOMDocument();
$xml->loadXML($result15);
foreach( $xml->entry as $entry )
{
echo "URL=".(string)$entry->id.PHP_EOL;
echo "Summary=".(string)$entry->summary.PHP_EOL;
}
You might find the curl request is failing. You need to do a couple of things...
function httpGet($result15)
{
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$result15);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // Add this (it disables certificate verification, so use it for debugging only)
$output=curl_exec($ch);
// If this fails, output error.
if ($output === FALSE) {
echo curl_error($ch);
// Not sure what you want to do, but 'exit' will work for now
exit();
}
curl_close($ch);
return $output;
}
This will display an error if the curl request fails. You will need to decide how you're going to cope with this. You could return false and then, in your code further down, check for that before trying to load the response as XML. The code above just stops on errors.
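As a rough sketch of the "return false and check it" approach: the helper below is hypothetical (the name parseFeed is not from the original code) and assumes that httpGet() has been changed to return false on failure instead of calling exit(). It uses an inline sample string so that it runs without a network connection.

```php
<?php
// Hypothetical helper: only parse the body if the fetch actually succeeded.
// Assumes httpGet() returns false (or an empty string) on failure.
function parseFeed($body)
{
    if ($body === false || $body === '') {
        return null; // nothing to parse
    }
    // Collect libxml errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

// Inline sample standing in for the real response.
$sample = '<feed><entry><id>http://example.com/1</id><summary>First</summary></entry></feed>';

$feed = parseFeed($sample);
if ($feed !== null) {
    foreach ($feed->entry as $entry) {
        echo 'URL=' . (string) $entry->id . PHP_EOL;
    }
}
```

Returning null keeps the decision about what to do on failure (retry, log, skip) in the calling code rather than inside the fetch function.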
Your next piece of code seems to mix SimpleXML and DOMDocument. You can use SimpleXML if the document structure is fairly straightforward...
$xml = simplexml_load_string($result15);
foreach( $xml->entry as $entry )
{
echo "URL=".(string)$entry->id.PHP_EOL;
echo "Summary=".(string)$entry->summary.PHP_EOL;
}
The issue:
I'm working with PHP, cURL and a public API to fetch json strings.
These json strings are formatted like this (simplified, average size is around 50-60 kB):
{
"data": {},
"previous": "url",
"next": "url"
}
What I am trying to do is fetch all the json strings starting from the first one by checking for the "next" attribute. So I have a while loop and as long as there's a "next" attribute, I fetch the next URL.
The problem is that sometimes, seemingly at random, the loop stops before the end, and after many tests I still cannot figure out why.
I say randomly because sometimes the loop goes through to the end and no problem occurs. Sometimes it crashes after N loops.
And so far I couldn't extract any information to help me debug it.
I'm using PHP 7.3.0 and launching my code from the CLI.
What I tried so far:
Check the headers:
No headers are returned. Nothing at all.
Use curl_errno() and curl_error():
I tried the following code right after executing the request (curl_exec($ch)) but it never triggers.
if(curl_errno($ch)) {
echo 'curl error ' . curl_error($ch) . PHP_EOL;
echo 'response received from curl error :' . PHP_EOL;
var_dump($response); // the json string I should get from the server.
}
Check if the response came back null:
if(is_null($response))
or if my json string has an error:
if(!json_last_error() == JSON_ERROR_NONE)
Though I think it's useless because it will never be valid if the cURL response is null or empty. When this code triggers, the json error code is 3 (JSON_ERROR_CTRL_CHAR)
The problematic code:
function apiCall($url) {
...
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
}
$inc = 0;
$url = 'https://api.example.com/' . $id;
$jsonString = apiCall($url);
if(!is_null($jsonString)) {
file_put_contents('pathToDirectory/' . $id + $inc, $jsonString);
$nextUrl = getNextUrl($jsonString);
while ($nextUrl) {
$jsonString = apiCall($url . '?page=' . $nextUrl);
if(!is_null($jsonString)) {
$inc++;
file_put_contents('pathToDirectory/' . $id + $inc, $jsonString);
$nextUrl = getNextUrl($jsonString);
}
}
}
What I'm expecting my code to do:
Not stop randomly, or at least give me a clear error code.
The problem is that your API could be returning an empty response, malformed JSON, or even a status code other than 200, and your code would stop executing immediately.
Since you do not control the API responses, you know that it can fail randomly, and you do not have access to the API server logs (because you don't, do you?), you need to build some kind of resilience into your consumer.
Something very simple (you'd need to tune it) could be:
function apiCall( $url, $attempts = 3 ) {
// ..., including setting "$headers"
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_HTTPHEADER, $headers );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
for ( $i = 0; $i < $attempts; $i ++ ) {
$response = curl_exec( $ch );
$curl_info = curl_getinfo( $ch );
if ( curl_errno( $ch ) ) {
// log your error & try again
continue;
}
// I'm only accepting 200 as a status code. Check with your API if there could be other possible "good" responses
if ( $curl_info['http_code'] != 200 ) {
// log your error & try again
continue;
}
// everything seems fine, but the response is empty? not good.
if ( empty( $response ) ) {
// log your error & try again
continue;
}
return $response;
}
return null;
}
This would allow you to do something like (pulled from your code):
do {
$jsonString = apiCall($url . '?page=' . $nextUrl);
$nextUrl = false;
if(!is_null($jsonString)) {
$inc++;
file_put_contents('pathToDirectory/' . ($id + $inc), $jsonString); // parentheses needed: before PHP 8, "." and "+" had equal precedence
$nextUrl = getNextUrl($jsonString);
}
}
while ($nextUrl);
Note that I'm not handling the case where the response from the API is non-empty, there was no connection error, the status code is 200, and yet the body is invalid JSON.
You may want to check for these things as well, depending on how brittle the API you are consuming is.
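One way to cover the invalid-JSON case is to validate the body before using it. This is only a sketch: decodeJsonOrNull() and nextUrlFrom() are hypothetical names (they are not the getNextUrl() from the question), and it assumes the "next" field holds the value to follow, as in the simplified structure shown in the question.

```php
<?php
// Hypothetical helper: return the decoded array, or null if the body is not valid JSON.
function decodeJsonOrNull($body)
{
    if (!is_string($body) || $body === '') {
        return null;
    }
    $decoded = json_decode($body, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        // json_last_error_msg() gives a readable reason that could be logged here.
        return null;
    }
    return is_array($decoded) ? $decoded : null;
}

// Hypothetical helper: return the "next" value, or null when it is missing or the JSON is invalid.
function nextUrlFrom($body)
{
    $decoded = decodeJsonOrNull($body);
    return isset($decoded['next']) && $decoded['next'] !== '' ? $decoded['next'] : null;
}

var_dump(nextUrlFrom('{"data": {}, "previous": "url", "next": "page2"}'));
var_dump(nextUrlFrom('{"data": {}, "next": "pa'));
```

A null result could be treated like any other failed attempt inside the retry loop, so a truncated or corrupted body triggers another request instead of silently ending the pagination.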
When attempting to create a new instance of SimpleXMLElement from a URL, I get the error Uncaught exception 'Exception' with message 'String could not be parsed as XML'.
Details:
$feed_url = 'https://www.ua.edu/news/feed/';
$the_feed = new SimpleXMLElement( $feed_url, LIBXML_NOCDATA, true );
When executing the code above, the error appears on both the Stage and Prod environments but not in Dev. I have compared the xml related settings between the Stage and Dev environments and only slight version differences exist (Dev is a SLIGHTLY older version of PHP than Stage/Prod).
$feed_url = 'http://hiphopdx.com/rss/news.xml';
$the_feed = new SimpleXMLElement( $feed_url, LIBXML_NOCDATA, true );
In Stage (Can't test this part in Prod), I changed the $feed_url variable to another feed's URL. Everything works as expected. A SimpleXMLElement object is created and can be dumped to the screen.
I have no clue how to proceed to correct this. Any help is much appreciated.
I checked all of the server settings that were referenced in similar questions and they were all properly set. I never did figure out why it wasn't pulling the XML from the URL. The solution that I ended up using was to use curl to get the XML from the URL and SimpleXMLElement to handle the raw XML.
function produce_XML_object_tree($raw_XML) {
libxml_use_internal_errors(true);
try {
$xmlTree = new SimpleXMLElement($raw_XML);
} catch (Exception $e) {
// Something went wrong.
$error_message = 'SimpleXMLElement threw an exception.';
foreach(libxml_get_errors() as $error_line) {
$error_message .= "\t" . $error_line->message;
}
trigger_error($error_message);
return false;
}
return $xmlTree;
}
$feed_url = 'http://feed.url';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $feed_url);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml = curl_exec($ch);
curl_close($ch);
$the_feed = produce_XML_object_tree($xml);
I want to get the whole <article> element, which represents one listing and contains the image, the title, its link and the description, but it doesn't work. Can someone help me please?
<?php
$url = 'http://www.polkmugshot.com/';
$content = file_get_contents($url);
$first_step = explode( '<article>' , $content );
$second_step = explode("</article>" , $first_step[3] );
echo $second_step[0];
?>
You should definitely be using curl for this type of request.
function curl_download($url){
// is cURL installed?
if (!function_exists('curl_init')){
die('cURL is not installed!');
}
$ch = curl_init();
// URL to download
curl_setopt($ch, CURLOPT_URL, $url);
// User agent
curl_setopt($ch, CURLOPT_USERAGENT, "Set your user agent here...");
// Include the response headers in the result? (0 = no, 1 = yes)
curl_setopt($ch, CURLOPT_HEADER, 0);
// Should cURL return or print out the data? (true = return, false = print)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
// Download the given URL, and return output
$output = curl_exec($ch);
// Close the cURL resource, and free system resources
curl_close($ch);
return $output;
}
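For best results, combine it with an HTML DOM parser.
Use it like this (assuming $output has already been parsed into a DOM object that provides find(), rather than being the raw string returned by curl_download()):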
// Find all images
foreach($output->find('img') as $element)
echo $element->src . '<br>';
// Find all links
foreach($output->find('a') as $element)
echo $element->href . '<br>';
Good Luck!
I'm not sure I understand you correctly, but I guess you need a PHP DOM parser. I suggest this one (it is a great PHP library for parsing HTML).
Also you can get whole HTML code like this:
$url = 'http://www.polkmugshot.com/';
$html = file_get_html($url);
echo $html;
Probably a better way would be to parse the document and run some xpath queries over it afterwards, like so:
$url = 'http://www.polkmugshot.com/';
$xml = simplexml_load_file($url);
$articles = $xml->xpath("//article");
foreach ($articles as $article) {
// do sth. useful here
}
Read about SimpleXML here.
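Keep in mind that SimpleXML expects well-formed XML, and real-world HTML pages often are not. A minimal sketch of the same XPath idea using DOMDocument's more forgiving HTML parser is shown below; extractArticles() is a hypothetical helper name, and the inline sample markup stands in for the downloaded page.

```php
<?php
// Hypothetical helper: return the outer HTML of every <article> element.
function extractArticles($html)
{
    $dom = new DOMDocument();
    // Real-world HTML triggers many parser warnings; collect them quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $articles = [];
    foreach ($xpath->query('//article') as $node) {
        $articles[] = $dom->saveHTML($node);
    }
    return $articles;
}

// Inline sample: the unclosed <img> tag is fine in HTML but not in XML.
$sample = '<html><body>'
        . '<article><img src="a.jpg"><h2>One</h2></article>'
        . '<article><h2>Two</h2></article>'
        . '</body></html>';

print_r(extractArticles($sample));
```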
Extract the articles with DOMDocument. Working example:
<?php
$url = 'http://www.polkmugshot.com/';
$content = file_get_contents($url);
$domd = new DOMDocument();
@$domd->loadHTML($content);
foreach($domd->getElementsByTagName("article") as $article){
var_dump($domd->saveHTML($article));
}
And as pointed out by @Guns, you'd better use curl, for several reasons:
1: file_get_contents will fail if allow_url_fopen is not enabled in php.ini
2: until PHP 5.5.0 (or somewhere around there), file_get_contents kept reading from the connection until the connection was actually closed, which for many servers can be many seconds after all content has been sent, while curl only reads until it has received the length given in the Content-Length HTTP header, which makes for much faster transfers (luckily this has since been fixed)
3: curl supports gzip and deflate compressed transfers, which again makes for much faster transfers (when the content is compressible, such as HTML), while file_get_contents always transfers uncompressed
EDIT: What is really happening is that a new XML file is created each time, but the new $html information is added to the previous information, so by the time it gets to the last element in the list being curled, it is saving the parsed information from all previous curls. I can't figure out what is wrong.
I'm having trouble with a curl request not executing as expected. In the code below I have a foreach loop that loops through a list ($textarray) and passes each list element to a curl function; the element is also used as the file name of an XML file. The curl function returns $html, which is then parsed and saved to the XML file. The script runs, the list is passed, and the URL is created and passed to the curl function. I get an echo showing the correct URL, a result is returned, and each result is parsed and saved to the appropriate file. The problem seems to be that curl is not actually fetching the new $url: I get exactly the same information saved in every XML file. I know this is not correct, but I'm not sure why it is happening. Any help appreciated.
Function FeedXml($textarray){
$doc=new DOMDocument('1.0', 'UTF-8');
$feed=$doc->createElement("feed");
Foreach ($textarray as $text){
$url="http://xxx/xxx/".$text;
echo "PATH TO CURL".$url."<br>";
$html=curlurl($url);
$xmlsave="http://xxxx/xxx/".$text;
$dom = new DOMDocument(); //NEW dom FOR EACH SHOW
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$dom->formatOutput = true;
$dom->preserveWhiteSpace = true;
//PARSE EACH RETURN INFORMATION
$images= $dom->getElementsByTagName('img');
foreach($images as $img){
$icon= $img ->getAttribute('src');
if( preg_match('/\.(jpg|jpeg|gif)(?:[\?\#].*)?$/i', $icon) ) {
// ITEM TAG
$item= $doc->createElement("item");
$sdAttribute = $doc->createAttribute("sdImage");
$sdAttribute->value = $icon;
$item->appendChild($sdAttribute);
} // IMAGAGE FOR EACH
$feed->appendChild($item);
$doc->appendChild($feed);
$doc->save($xmlsave);
}
}
}
Function curlurl($url){
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_VERBOSE, 1);//0-FALSE 1 TRUE
curl_setopt($ch,CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER ,FALSE);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_TIMEOUT,'10');
$html = curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
echo $httpcode;
return $html;
}
Thanks for pointing out my shortcomings on the above. I have figured out the problem. The following needed to be moved into the Foreach.
$doc=new DOMDocument('1.0', 'UTF-8');
$feed=$doc->createElement("feed");
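To illustrate the fix, here is a minimal sketch in which the output document is created once per page, so nothing carries over from one iteration to the next. buildFeedXml() is a hypothetical helper, and the inline HTML strings are made-up samples standing in for what curlurl() would return. The sketch also appends an item only when the image matches, which is an assumption about the intended behaviour.

```php
<?php
// Hypothetical helper: build one independent <feed> document per fetched page.
function buildFeedXml($html)
{
    // Created inside the function, so each call starts from an empty document.
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    $feed = $doc->createElement('feed');
    $doc->appendChild($feed);

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Same image filter as in the question.
        if (preg_match('/\.(jpg|jpeg|gif)(?:[?#].*)?$/i', $src)) {
            $item = $doc->createElement('item');
            $item->setAttribute('sdImage', $src);
            $feed->appendChild($item);
        }
    }
    return $doc->saveXML();
}

// Made-up sample pages keyed by list element.
$pages = [
    'show1' => '<html><body><img src="a.jpg"><img src="logo.png"></body></html>',
    'show2' => '<html><body><img src="b.gif"></body></html>',
];
foreach ($pages as $name => $html) {
    echo $name . PHP_EOL . buildFeedXml($html);
}
```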
I am trying to display the data from an XML page that I get from an external source by passing some parameters, like this:
http://www.somewebsite.com/phpfile.php?vendor_key=xxx&checkin=2012-11-02&checkout=2012-11-05&city_id=5&guests=3
When I pass these parameters I get an XML result. Now I want to display that XML data in a nicely designed way on my web page. How can I do that? I am new to XML, so I don't know what this technique is called; if anybody can tell me its name, that would help too.
Take a look at simplexml_load_string.
You can use curl or the file_get_contents function to make the HTTP request. After that, you can use DOM or SimpleXML to parse the XML response from the requested URL.
If you already have the XML, then try
echo $xml->asXML();
A full example
<?php
$curl = curl_init();
curl_setopt ($curl, CURLOPT_URL, 'http://rss.news.yahoo.com/rss/topstories');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec ($curl);
if ($result === false) {
die('Error fetching data: ' . curl_error($curl));
}
curl_close ($curl);
//we can at this point echo the XML if you want
//echo $result;
//parse xml string into SimpleXML objects
$xml = simplexml_load_string($result);
if ($xml === false) {
die('Error parsing XML');
}
//now we can loop through the xml structure
foreach ($xml->channel->item as $item) {
print $item->title;
}