I have the following php code. I am able to get the value stored in my variable and printed. But after json_encode, the value became 'empty'. Is there a bug in json_encode ? I am using Php 7.2 on ubuntu 18
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'https://www.thestar.com.my/rss/news/nation');
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml = curl_exec($ch);
curl_close($ch);
$rss = simplexml_load_string($xml);
$cnt = count($rss->channel->item);
$items = array();
$items[0]['title'] = $rss->channel->item[0]->title;
$items[0]['url'] = $rss->channel->item[0]->link;
echo $items[0]['title'] . PHP_EOL;
echo $items[0]['url'] . PHP_EOL;
echo json_encode($items) . PHP_EOL;
?>
See the output below. What is wrong with the json_encode ? Why after the json_encode, the title became empty ?
Sabah updates SOPs for places of worship and childcare centres
https://www.thestar.com.my/news/nation/2021/06/13/sabah-updates-sops-for-places-of-worship-and-childcare-centres
[{"title":{"0":{}},"url":{"0":"https:\/\/www.thestar.com.my\/news\/nation\/2021\/06\/13\/sabah-updates-sops-for-places-of-worship-and-childcare-centres"}}]
json_encode produces an empty object tag ({}) when passed PHP objects that do not implement JsonSerializable or contain public member variables.
You pass it \SimpleXMLElement nodes, not strings.
In your echo it just shows because the XML entities implement __toString() which echo invokes.
Either do that aswell or cast it to (string):
<?php
$rss = simplexml_load_string($xml);
$items = array();
$items[0]['title'] = $rss->channel->item[0]->title->__toString();
$items[0]['url'] = (string) $rss->channel->item[0]->link;
echo json_encode($items) . PHP_EOL;
?>
Thanks..Got it with
$items[0]['title'] = $rss->channel->item[0]->title->__toString();
Related
I appreciate the time you take to try and help me with my question.
So what i am doing is trying an html parser from a link. So I use curl first to link to the website then I convert it into htmlentities() so it doesn't load on the page so I get a string from that then i use the DOM object to extract the tag from. I checked different methods for a parser on google search so i learned a little bit about it then i execute my script but the problem is that the string is getting saved as textCont and not as a real html document so i would like to know how can convert htmlentities string into a real dom document and extract elements from it ?
the image of the var_dump is here
here is my script:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'https://www.usatoday.com/story/news/world/2021/02/17/dubai-princess-sheikha-latifa-says-she-hostage-after-flee-attempt/6778014002/?utm_source=feedblitz&utm_medium=FeedBlitzRss&utm_campaign=usatodaycomworld-topstories');
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($curl);
curl_close($curl);
$htmlentities = htmlentities($result);
// I added the code here
$htmlDom = new DOMDocument();
$htmlDom->loadHTML($htmlentities);
$htmlDom->preserveWhiteSpace = false;
$styles = $htmlDom->getElementsByTagName('style');
foreach ($styles as $style) {
$item = $style->getElementsByTagName('td');
//echo the values
echo '1: '.$item->item(0)->nodeValue.'<br />';
echo '2: '.$item->item(1)->nodeValue.'<br />';
echo '3: '.$item->item(2)->nodeValue;
}
EDIT:
what i added next to the code is this:
$htmlentities = htmlentities($result);
$htmlentities = str_replace(""",'"', $htmlentities);
$htmlentities = str_replace("'","'", $htmlentities);
$htmlentities = str_replace("<","<", $htmlentities);
$htmlentities = str_replace(">",">", $htmlentities);
libxml_use_internal_errors(true);
$htmlDom = new DOMDocument();
$htmlDom->loadHTML($htmlentities);
libxml_clear_errors();
var_dump($htmlDom);
I tried to extract the download url from the webpage.
the code which tried is below
function getbinaryurl ($url)
{
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FRESH_CONNECT, true);
$value1 = curl_exec($curl);
curl_close($curl);
$start = preg_quote('<script type="text/x-component">', '/');
$end = preg_quote('</script>', '/');
$rx = preg_match("/$start(.*?)$end/", $value1, $matches);
var_dump($matches);
}
$url = "https://www.sourcetreeapp.com/download-archives";
getbinaryurl($url);
this way i am getting the tags info not the content inside the script tag. how to get the info inside.
expected result is:
https://product-downloads.atlassian.com/software/sourcetree/ga/Sourcetree_4.0.1_234.zip,
https://product-downloads.atlassian.com/software/sourcetree/windows/ga/SourceTreeSetup-3.3.6.exe,
https://product-downloads.atlassian.com/software/sourcetree/windows/ga/SourcetreeEnterpriseSetup_3.3.6.msi
i am very much new in writing these regular expressions. can any help me pls.
Instead of using regex, using DOMDocument and XPath allows you to have more control of the elements you select.
Although XPath can be difficult (same as regex), this can look more intuitive to some. The code uses //script[#type="text/x-component"][contains(text(), "macURL")] which broken down is
//script = any script node
[#type="text/x-component"] = which has an attribute called type with
the specific value
[contains(text(), "macURL")] = who's text contains the string macURL
The query() method returns a list of matches, so loop over them. The content is JSON, so decode it and output the values...
function getbinaryurl ($url)
{
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FRESH_CONNECT, true);
$value1 = curl_exec($curl);
curl_close($curl);
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($value1);
libxml_use_internal_errors(false);
$xp = new DOMXPath($doc);
$srcs = $xp->query('//script[#type="text/x-component"][contains(text(), "macURL")]');
foreach ( $srcs as $src ) {
$content = json_decode( $src->textContent, true);
echo $content['params']['macURL'] . PHP_EOL;
echo $content['params']['windowsURL'] . PHP_EOL;
echo $content['params']['enterpriseURL'] . PHP_EOL;
}
}
$url = "https://www.sourcetreeapp.com/download-archives";
getbinaryurl($url);
which outputs
https://product-downloads.atlassian.com/software/sourcetree/ga/Sourcetree_4.0.1_234.zip
https://product-downloads.atlassian.com/software/sourcetree/windows/ga/SourceTreeSetup-3.3.8.exe
https://product-downloads.atlassian.com/software/sourcetree/windows/ga/SourcetreeEnterpriseSetup_3.3.8.msi
So I'm having a problem with the following code.
I've got CURLOPT_RETURNTRANSFER set to true, yet nothing is returned when curl_exec is hit. Any and all help is appreciated!
<?php
$yql_base_url = "http://query.yahooapis.com/v1/public/yql?q=";
$yql_query = "select * from csv where url='http://download.finance.yahoo.com/d/quotes.csv?s=YHOO,GOOG,AAPL&f=sl1d1t1c1ohgv&e=.csv' and columns='symbol,price,date,time,change,col1,high,low,col2'";
$yql_params = "&format=json&diagnostics=true&callback=";
$yql_url = $yql_base_url . urlencode($yql_query) . $yql_params;
$session = curl_init($yql_url);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
$json = curl_exec($session);
curl_close($session);
$phpObj = json_decode($json);
if(!is_null($phpObj->query->results))
{
echo $phpObj->query->results;
}
?>
$phpObj->query->results is an array of Object and you can not do echo on it. Simply use print_r() or var_dump() on it.
Example:
print_r($phpObj->query->results);
var_dump($phpObj->query->results);
I am trying to use the currentcy exchange rate feeds of the European Central Bank (ECB)
http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml
They have provided documentation on how to parse the xml but none of the options works for me: I checked that allow_url_fopen=On is set.
http://www.ecb.int/stats/exchange/eurofxref/html/index.en.html
For instance, I used but it doesn't echo anything and it seems the $XML object is always empty.
<?php
//This is aPHP(5)script example on how eurofxref-daily.xml can be parsed
//Read eurofxref-daily.xml file in memory
//For the next command you will need the config option allow_url_fopen=On (default)
$XML=simplexml_load_file("http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml");
//the file is updated daily between 2.15 p.m. and 3.00 p.m. CET
foreach($XML->Cube->Cube->Cube as $rate){
//Output the value of 1EUR for a currency code
echo '1€='.$rate["rate"].' '.$rate["currency"].'<br/>';
//--------------------------------------------------
//Here you can add your code for inserting
//$rate["rate"] and $rate["currency"] into your database
//--------------------------------------------------
}
?>
Update:
As I am behind proxy at my test environment, I tried this but still I don't get to read the XML:
function curl($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_close ($ch);
return curl_exec($ch); }
$address = urlencode($address);
$data = curl("http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml");
$XML = simplexml_load_file($data);
var_dump($XML); -> returns boolean false
Please help me. Thanks!
I didn't find any relevant settings in php.ini. Check with phpinfo() if you have SimpleXML support and cURLsupport enabled. (You should have them both and especially SimpleXML since you're using it and it returns false, it doesn't complain about missing function.)
Proxy might be an issue here. See this and this answer. Using cURL could be an answer to your problem.
Here's one alternative foud here.
$url = file_get_contents('http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml');
$xml = new SimpleXMLElement($url) ;
//file put contents - same as fopen, wrote and close
//need to output "asXML" - simple xml returns an object based upon the raw xml
file_put_contents(dirname(__FILE__)."/loc.xml", $xml->asXML());
foreach($xml->Cube->Cube->Cube as $rate){
echo '1€='.$rate["rate"].' '.$rate["currency"].'<br/>';
}
This solution works for me:
$data = [];
$url = "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml";
$xmlRaw = file_get_contents($url);
$doc = new DOMDocument();
$doc->preserveWhiteSpace = FALSE;
$doc->loadXML($xmlRaw);
$node1 = $doc->getElementsByTagName('Cube')->item(0);
foreach ($node1->childNodes as $node2) {
$value = [];
foreach ($node2->childNodes as $node3) {
$value['date'] = $node2->getAttribute('time');
$value['currency'] = $node3->getAttribute('currency');
$value['rate'] = $node3->getAttribute('rate');
$data[] = $value;
unset($value);
}
}
echo "<pre"> . print_r($data) . "</pre>";
I have a php function that fetches and returns tweets data from twitter as simplexml object.I could get its contents by using php syntax. Here is php function
<?php
function searchResults($q) {
$host = "http://search.twitter.com/search.atom?q=" . urlencode( $q ) . "&rpp=100";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $host);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
//Raw xml
$result = curl_exec($ch);
curl_close($ch);
$xml = simplexml_load_string($result);
return $xml;
}
?>
If I call it like
$xml = searchResults('xyz');
I could fetch its contents like
echo $xml->content.''.$xml->author->name;
Now I need to return it from php function in JSON format. Like
return json_encode($xml);
in spite of
return $xml;
So how do now I get same 'content' and 'author->name' etc contents from it in JSON format when I decode json.
If you don't need the XML for any other reason than to return it as JSON, why not use json format as the response for the Twitter API call?
http://search.twitter.com/search.json?q=blablabla
This returns the response in a JSON string that you could just return.
I would rewrite your code like this:
<?php
function searchResults($q) {
$host = "http://search.twitter.com/search.json?q=" . urlencode( $q ) . "&rpp=100";
$raw_json = file_get_contents($host);
return $raw_json;
}
?>
It depends. If you're accessing it back from PHP, there are two ways:
$obj = json_decode($json);
echo $obj->author->name;
or
$arr = json_decode($json, true);
echo $arr['author']['name'];
If you are accessing it using JavaScript, it should be:
alert(jsObject.author.name);