How to get elements with custom name from rss?

How to get elements with custom name from rss? - php

I have the following rss format, and i can't eject the 'content:encoded' value.
<item>
<title>some title</title>
<link>some link</link>
<pubDate>Sat, 07 Apr 2012 5:07:00 -0700</pubDate>
<content:encoded><![CDATA[this value]]></content:encoded>
</item>
i wrote this function, everything works well except the 'content:encoded' field that give me this error: 'Notice: Trying to get property of non-object'
function rssReader($url) {
$doc = new DOMDocument();
$doc->load($url);
$fields = array('title', 'description', 'link', 'pubDate', 'content:encoded');
$nodes = array();
foreach ($doc->getElementsByTagName('item') as $node) {
$item = array();
var_export($node, true);
foreach ($fields as $field)
$item[$field] = $node->getElementsByTagName($field)->item(0)->nodeValue;
$nodes[] = $item;
}
return $nodes;
}

You need to use getElementsByTagNameNS instead of getElementsByTagName for 'content:encoded' tag:
foreach ($fields as $field){
if( $field == 'content:encoded' ){
$item[$field] = $node->getElementsByTagNameNS('contentNamespaceURI','encoded')->item(0)->nodeValue;
}else{
$item[$field] = $node->getElementsByTagName($field)->item(0)->nodeValue;
}
}
You could find 'contentNamespaceURI' in rss. There must be something like:
xmlns:content="contentNamespaceURI"

The Tag name here is "encoded".
Just use
$content => $node->getElementsByTagName('encoded')->item(0)->nodeValue

Related

How to retrieve an attribute from a namespaced element

I am trying to get the url attributes from the <media:content> elements in this RSS feed:
https://news.google.com/rss/search?q=test&as_qdr=w1&scoring=n&num=100&hl=en-CA&gl=CA&ceid=CA:en
Here's what I have so far:
$feed_url = "https://news.google.com/rss/search?q=test&as_qdr=w1&scoring=n&num=100&hl=en-CA&gl=CA&ceid=CA:en";
$rss = file_get_contents($feed_url);
$rss = new SimpleXMLElement($rss);
$items = $rss->channel->item;
foreach ($items as $item) {
print_r($item);
echo "<hr>";
}
This code works for all elements except the ones with a semicolon in the name, like <media:content> or <dc:contributor>. If I open the XML feed in my browser I can see the tag I am looking for:
<media:content url="https://lh6.googleusercontent.com/proxy/nxX8kqpFKSDvYg_bf_QrdsS0PYNMFPGspYmTlZlIo0IzyyhYhURxQc5nrpnzfrNBZkWQywioGXdPclazSIEwiz5wklsBePHOCft9qdHl2EmqIES_SMl5orim2xM2eHYalvIgFFeGYvp7cQaCQpKAObhPGQ--diqZg4Io3MSW8f6PXlRAbUcPvpDxB-KRqBj53bbROhoUYuqxkA=-w150-h150-c" medium="image" width="150" height="150"/>
</item>
I tried various solutions from other threads but it didn't work for me. Example:
$xml_object = $rss->channel->item[0];
$ns_media = $xml_object->children('http://search.yahoo.com/mrss/');
I don't know what I'm doing wrong so I would appreciate some help

You're missing a call to the attributes() method of your SimpleXMLElement instance:
foreach ($items as $item) {
$media_content_url = $item->children('http://search.yahoo.com/mrss/')->attributes()->url;
// ...
}

I'm not familiar with SimpleXML, but this is simple enough to do with DomDocument:
$feed_url = "https://news.google.com/rss/search?q=test&as_qdr=w1&scoring=n&num=100&hl=en-CA&gl=CA&ceid=CA:en";
$rss = file_get_contents($feed_url);
$dom = new DomDocument;
$dom->loadXML($rss);
$nodes = $dom->getElementsByTagNameNS("http://search.yahoo.com/mrss/", "content");
foreach ($nodes as $node) {
printf("%s\n", $node->getAttribute("url"));
}
Output:
https://lh6.googleusercontent.com/proxy/m7yanlDdWIjGc1XmsY6AHB5DqqJcgSe1Z7vs9DUC5NbD-FfQqJzEY8uIadNckLJFu7O6rcuh4W-CsXRg2vjr_KLOWhwNG5shhfdetcUkY5dMHa0uN1GBC5iY0svkP-Wxcm7JJ_kJMh6sctcvJ5Hfbb2Vor8KPlnYXUk_Y3jxYeCgmDBTqeRKwQ1pTMtWtJ_7fK5P5PSdKQKjUNnfVODZjHg_c4PwFWw3Cw=-w150-h150-c
https://lh3.googleusercontent.com/proxy/j7vDbXvscxGVLF8xo2DGkEgmgyQ9-u5vE0RWJjmAp84xOuy4v-Ff6cHADsLiC2Zd2KE7s04sCgtT_WNx4K5vxjDw_jbFRwQhlBgpL-YdXMgvDgakxzx8xWDO5bdpHaVssEGXgkxCnXnHXBRgb67vXeY6XnbgeEp7Fe5ohK1fpyk_hE3IYGyHdJnTxiH_=-w150-h150-c
https://lh6.googleusercontent.com/proxy/nxX8kqpFKSDvYg_bf_QrdsS0PYNMFPGspYmTlZlIo0IzyyhYhURxQc5nrpnzfrNBZkWQywioGXdPclazSIEwiz5wklsBePHOCft9qdHl2EmqIES_SMl5orim2xM2eHYalvIgFFeGYvp7cQaCQpKAObhPGQ--diqZg4Io3MSW8f6PXlRAbUcPvpDxB-KRqBj53bbROhoUYuqxkA=-w150-h150-c
https://lh3.googleusercontent.com/proxy/PbDyKTNQAyxkLNnyQFm00dHkNyKoASc3zKJjw7tjRtfmebHfbP_Ov_5RfcsG1RL8gyFaMSvVltd7IQJns6x_N_thPQTWz7E3ER0RlqLhZBYjmM-cp9xUkCdICiFyfkY0XGx-xGSh6zq5C_SpuzAxCVdhoOkqW4Lz_kyw-KN1fUJB6b8VgDGFvssIfmurSm3qCdJYeFJAx7x6lh_NQS1GNeNdbVBf0RoE2jiZfK6SYgFCX2s9KifQ7Sld_0plNrvTyW6VSR9D0AEwlBClWXNfoMmB_NYl4j03ELoUIjB0fRpUxV7YAqiIC1nSxqn92Q=-w150-h150-c
https://lh6.googleusercontent.com/proxy/DcJhP_BX_r7IbiwFgYt7MOL8RKQMizjCEWAn4YBcWdDy-PYOncpb_PrDZ0H5cMlxSFk9X8SQz5WEAX1xJRV4RWBiPwSH6uDJr7bt1Dh4H7MyYAaB_66BnASNA-fw0pszLPYEgfkwZsRyEZNT045MRYXA3Q=-w150-h150-c
https://lh5.googleusercontent.com/proxy/ECDRkzc4AO3OP1V0PNEVw1OhBYwuDRJVdDzF9lFNW34D8aNO8s54aWJfuR_LhDz-wKCRRvS64ggZnsg2UkCE5EYJghnBkQlpmwktgFYcKW0OnXP_-Ynh37EHR9nG9lyRoM1-3ebj=-w150-h150-c
https://lh6.googleusercontent.com/proxy/9xL1YVQyHg2mzitNgeiHRkjq_vJBxmOb0iAb1bBfJcqFLlWOJWkzRmLYrc9-hmK4nGcLnFfLMEb-bTnZmlWRM72_9ibysFUU_z-77ZK5PhX-f8vfIoWp=-w150-h150-c
https://lh5.googleusercontent.com/proxy/FcDG9_8xHTVUP2uHZ6cMAnIAdDxd-Kg29IksHUEDJCX8_mjTd2voG8BITnqpPEXKtFEImDogPfJfHNlqJr7X2I0VHlkesJvRnE2D-aLRxiNJfc2Lh6WIb0PrRy2nluPe6IJOhUulh0AzZ8JXJVBPgYnfeifItdhBsCTz7QtKGN4DbLZzDAVRL28mHNzaFBlCjCMGyhrbR3jmmlLWqj5K6lbfdoS0jxbLDKkNY8ywVq61rua1YHe-J6ZOY40ESL_0hf27KTgIJFSNq99yGX2sMw=-w150-h150-c
https://lh3.googleusercontent.com/proxy/ciJtTGr7RBlyJD2JGS-Ps4OUOcHxph6Csa1RCJOhcIjjqnjMHnqDzBU2MwEBoieNz37zmC69cPcoUi9696CfpMYh_cry5O6xmTT-1BnlyJhGMTeDR2mIbf3-3VEJ5YsNHmyozvClGvbR6_ZOaMgH0w3AWwf_bZppmqXp7Bul8rXDOkIDMeHrmKCpQcJff-lAbV7hnud3h0JH0--7zw5wCCOj57dNwIA=-w150-h150-c
https://lh5.googleusercontent.com/proxy/nyoQUYC-IPq8Td_GPYc60euyS5cgKwUk7ta1ISJl9wafgrGt1HkhVtPSpoO36KZl_8em4B9bBuP_OXmR0RZlGk1yLwcfAK42NknrGy5H0bLwJqouJ0sE0a21EwardDsVGe1XhGXETO92NfSG2Tikpl9pUFLiJEE-ySdL2d5CK9LA82P4DG6FM5eW=-w150-h150-c
https://lh3.googleusercontent.com/proxy/F2QF_T-xeBkwMyMljtdxwaXMQmNvyG9YCv2QmdBBSmNCe6okG7AIuElWuXXI45IjTA8fuyEZbeGEHBJdWIeyxcjiwapXjzAIxm1lxrmMdzLgJyD4C867KZtcTS1NTWyJebHY4u3gBQ2Z=-w150-h150-c
https://lh4.googleusercontent.com/proxy/rbbBxl7QWG4BBIlsvJUIfL3fr8j4f7L_LoRc3NvfWcNOGT7oX7U1_CpJ71oE04TD0Ax75WmDJrlBQNYPGcsQnOid874Z6P3nwNpdNtZlytX_6FlXqefr58IQ3fB-sivfI20EmQVnRfaBXUozjpHbjW225jeI1hxWc1U15MC_rMuJryLEC_CV4OLg=-w150-h150-c
https://lh5.googleusercontent.com/proxy/7TRaIMikpkiHsWKNenXItiKUUSnRhEl92XSOmDHl828VWobr_M8crMxMLvfG9BDbVc4SioxINBmDyDLhxyHiLlk_c_-6ocsm6ATjrUbWHuc-FTba=-w150-h150-c
https://lh5.googleusercontent.com/proxy/VDXcSNdIqLdhl6IFXyQlwOlzDlhPGaPTOWN0XMuX_DHozxXowzuQMWGAnFNIizgXavLZ5rVcw9rGl7NWTHMyRboJVqjzRRtoOs1GJCb0dylyUsHSt5qUSOZULdyBDPkW6HXVDHHyR40EeR4CS9nOX0M=-w150-h150-c
https://lh5.googleusercontent.com/proxy/YlbnCDloJxYZBVFhx-k2JZzYzFhcBA6DMAim5QTgNoyB4-Q_8DjgR2-JV73ARnUmpROHYfdZiKwfEUCdPB7tUSJ-uJuSRFgfQ_t8CV6rQ8zAXiIKOuoNO20AMArh5NXKr99nP_FoiOLf6mIEJw--URXUP0Tg4-i7bCXXdIPIvVpWNaDxKesNa1MRzIWkzHnYoGuy5QGL7byi32Y0ld8mHP2KFbXT2aga0f3S5rl-ikFkRxpaSxk0coE=-w150-h150-c
https://lh6.googleusercontent.com/proxy/XgDFopqdfCYmrzYi5NKRDYZYZgmJ2fyAn-9eN9QnXgmBezTviTAYV4ct07peVeZMerMND3ZwgZ-lK8Uv7B6FivV6LXqAJN4E7OVvtKYToSriuCRK4QOTuf6oFXG0KTetG-QJCoHZT77mWJtCGb0jG9tch59MJ0aWWZ1NA5X7wF4aMtoLSHkvAK2cuspI85CpPMj2ayu6wiG-0GT8fcAwRsVW64773g=-w150-h150-c
https://lh6.googleusercontent.com/proxy/WTq4Z72Acy7ykLcNmv9b_B2IQerfE4M7V00dxJ1o65IBz4OPyDzKkDBrVLAvqKdSjOHuTsHwTw5_UBINqKU3tFIOeEjCdsOs3yLkqG52YI-NGYvtw0EUohgu6Ps=-w150-h150-c
https://lh6.googleusercontent.com/proxy/uyiP15MOcfSPSHn6FU_LMK08w-0WiaDKQwC0e5rC67ZmdEqYqDYQDjHCkkH0UB0vxLRNpUw0jyz_YCvZ713hPuAeZM4xk7ZItIzNnkXRv_H-n8196YojFvVHZjbRYNZguxHtN7uAT1hxNvrRlRL-pdG9eiUs58rx_w=-w150-h150-c
https://lh4.googleusercontent.com/proxy/MBgkZaPJYX173801aVCotrRKb5pTCO3orWgu0cgsCuglj0bAQW7Za7HhaQUVk2NIUxg8MQz0qiizFGBTSZXdcUwpsmTPpz2rXYuZpqxgSAsxjZH4RUZW9P74EuM-_KzJQ7fKZx7sVK26kLxXr206i6DWH8LCCFa9utvSWevAS2OlYSYc17a09Z6Af5NJ7FoIJ6jxh1wnsuZhfwebq_4zQGDg4AF9XVrKayVrYExcBtx-kYqgjAqTiswm7YoO1yJGBUNkXh1aXC76C50WEOiYemdoXeygD1KIsRpExccCQZZ_NDjIgRMjBvPAGy-1Ue1xwCiXvgFu1P7AtZ2g0XA=-w150-h150-c
https://lh6.googleusercontent.com/proxy/9TiiPwVKm7hFE4kzBEeQ2nrwR8h3hjJ1KLMw6G5s-3_FzO0QorKfKYkubFvF-YDgMYyujUloeApsM00xuBWgQYJE4vRrcdXFoAik032DyRykwAYl7e87Qjqb2wnNMUpfmX1PTZo_OK6VC69sGQY9ISWevaI09tKI6w=-w150-h150-c
https://lh4.googleusercontent.com/proxy/bePdkudsqFAyihVJPg94KH4SKhVTwB0BSFiXEsCdQEZubli_o9RtbEctWtw1X5CC_x9JqM9vBPhWMyP29eBmkrCzi04osGEwiTOoaeLI3WhPl49w0UKeNvOONgkK3ZbPhJ4RFutptw71nhLoiWX2uAUDChy_zgxYmg=-w150-h150-c
https://lh5.googleusercontent.com/proxy/GgkEfaSJZU_ZrtKUpU3aqKBS-u1fUkz3wDJH7m3iPlFduw_C0yfprGaOGubYqxBdILL69inogeniN3zM83hh2EaBGS5wNJ8PfmSe1bUOy605NxMR19SPSgeLu0hGlrc_d16v-V1OtwxDt2SVDM382UQ=-w150-h150-c
https://lh3.googleusercontent.com/proxy/C04R8mKirMGoXC7SvbOwbMAh2BpjAUcqRhyFEoZhIg2bl55t5jgF6zLwXvAAe4O95ETW1fp1sIQSGgzCoxlFBp51LCEzyfvDiKkxt0LpYzHmNTeIxmGTlmBkRv4wRNGquW0ZBp1AWnjoqaGosgMWv0kQI6QTkgFTEI5tuhrLppr_0Xcfy4JNSqo8bSVxa_fb27Iih5Evf1RUSS6Umnc6wW3cHip0icT7QmfdebQs3LUvSUqHaaeDikc60NOQdZRX8tUcODic84RoOn41vM8NvBrZZuQgImC8GTPavaMTUIsEOK4QxSNUdxKduRcPpa0p=-w150-h150-c
https://lh4.googleusercontent.com/proxy/qpUSiYScbVAjAv63NKloqPadlLXD7xWo0eocfLMerlUozukyVTS4QWZYcJBPmkHuxJZCh85Zh1mEepVEeZH3JSMxQRrcE-4Apawmnw=-w150-h150-c
https://lh6.googleusercontent.com/proxy/CNvMvcbMHHckCXbyFXkRZnnuPr17TEzvspLGwIobu15dDsrlHt-3QKzi7kcHKvTpJZlCr2l-HhWOPBfJJzPRrd2gn734awWdE5jXRcqrnfly4bwnIPokO68_luWur73lg-k=-w150-h150-c
https://lh4.googleusercontent.com/proxy/eDks_caYi82LNyG2L2AMQENYCuL7LfaH5rhL-qT6QiVnb5r142EFLTe4u61mZf-1xE2UkJB9GfcUy4x5IfNOU-JCd78FMn--f1CldoEY9y7ouU5cdZ8=-w150-h150-c
https://lh6.googleusercontent.com/proxy/WsWb8Woo-ogEIuDkvaBpsxrizSDUwM_k6h9w1ma-d_i4f3c9Bpefe6llemcMlZNODP5hx_raBrZ6dlclfDXJpirHGgVuTFp3W_mCdrGWO1LCsQf6Nz3iyjgJIbFFv12K3rC9sy2sfV3kgpQRURxi50MwLLG4lUcDx8LIiHWk5bG-VR9IBpygAMPtL5LJoRN8fkg9Vh7RA-J8kkbDm8-xirGXhkYheENaly7yH3qpIo_3aBYrHzS1GsCULOfpjdEuw2OISw=-w150-h150-c

Array filter in PHP

I am using a simple html dom to parsing html file.
I have a dynamic array called links2, it can be empty or maybe have 4 elements inside or more depending on the case
<?php
include('simple_html_dom.php');
$url = 'http://www.example.com/';
$html = file_get_html($url);
$doc = new DOMDocument();
#$doc->loadHTML($html);
//////////////////////////////////////////////////////////////////////////////
foreach ($doc->getElementsByTagName('p') as $link)
{
$intro2 = $link->nodeValue;
$links2[] = array(
'value' => $link->textContent,
);
$su=count($links2);
}
$word = 'document.write(';
Assuming that the two elements contain $word in "array links2", when I try to filter this "array links2" by removing elements contains matches
unset( $links2[array_search($word, $links2 )] );
print_r($links2);
the filter removes only one element and array_diff doesn't solve the problem. Any suggestion?

solved by adding an exception
foreach ($doc->getElementsByTagName('p') as $link)
{
$dont = $link->textContent;
if (strpos($dont, 'document') === false) {
$links2[] = array(
'value' => $link->textContent,
);
}
$su=count($links2);
echo $su;

Parsing RSS with PHP

I'm trying to parse RSS: http://www.mlssoccer.com/rss/en.xml .
$feed = new DOMDocument();
$feed->load($url)
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
foreach($items as $key => $item)
{
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
$pubDate = $item->getElementsByTagName('pubDate')->item(0)->firstChild->nodeValue;
$description = $item->getElementsByTagName('description')->item(0)->firstChild->nodeValue;
// do some stuff
}
The thing is: I'm getting "$title" and "$pubDate" without a problem, but for some reason "$description" is always empty, there's nothing in it. What could be the reason for such behaviour and how to fix it?

The problem was with CDATA you need to use textContent instead of nodeValue to retreive value beetween
<?php
$feed = new DOMDocument();
$feed->load('http://www.mlssoccer.com/rss/en.xml');
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
foreach($items as $key => $item)
{
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
$pubDate = $item->getElementsByTagName('pubDate')->item(0)->firstChild->nodeValue;
$description = $item->getElementsByTagName('description')->item(0)->textContent; // textContent
}

Here can be whitespaces between the opening <description> tag and the opening <![CDATA[. This is a text node.
So if you access the firstChild of description, you might fetch that whitespace text node.
In a generic way you can set the DOMdocument to ignore whitespace nodes:
$feed = new DOMDocument();
$feed->preserveWhiteSpace = FALSE;
$feed->load($url);
Additionally you should check out XPath, it makes reading a DOM much easier:
$xpath = new DOMXpath($feed);
foreach ($xpath->evaluate('//channel/item') as $item) {
$title = $xpath->evaluate('string(title)', $item);
$pubDate = $xpath->evaluate('string(pubDate)', $item);
$description = $xpath->evaluate('string(description)', $item);
// do some stuff
var_dump([$title, $pubData, $description]);
}

Parsing xml with simplexml_load

I am trying to parse an xml but I get a problem while I am trying to fetch image url.
My xml is:
<entry>
<title>The Title</title>
<id>http://example.com/post/367327.html</id>
<summary>Some extra text</summary>
<link rel="enclosure" href="http://example.com/photos/f_0px_30px/image687.jpg" type="image/jpeg" length="" />
</entry>
So far I am using the code below to fetch the other data:
$url = "http://msdssite.com/feeds/xml/myxml.xml";
$xml = simplexml_load_file($url);
foreach($xml->entry as $PRODUCT)
{
$my_title = trim($PRODUCT->title);
$url = trim($PRODUCT->id);
$myimg = $PRODUCT->link;
}
How can I parse the href from this: <link rel="enclosure" href="http://example.com/photos/f_0px_30px/image687.jpg" type="image/jpeg" length="" />

Since it seems that your entries can contain several link tags, you need to check that the type attribute has the value image/jpeg to be sure to obtain a link to an image:
ini_set("display_errors", "On");
$feedURL = 'http://OLDpost.gr/feeds/xml/category-takhs-xatzhs.xml';
$feed = simplexml_load_file($feedURL);
$results = array();
foreach($feed->entry as $entry) {
$result = array('title' => (string)$entry->title,
'url' => (string)$entry->id);
$links = $entry->link;
foreach ($links as $link) {
$linkAttr = $link->attributes();
if (isset($linkAttr['type']) && $linkAttr['type']=='image/jpeg') {
$result['img'] = (string)$linkAttr['href'];
break;
}
}
$results[] = $result;
}
print_r($results);
Note that using simplexml like that (the foreach loop to find the good link tag) isn't very handy. It's better to use an XPath query:
foreach($feed->entry as $entry) {
$entry->registerXPathNamespace('e', 'http://www.w3.org/2005/Atom');
$results[] = array(
'title' => (string)$entry->title,
'url' => (string)$entry->id,
'img' => (string)$entry->xpath('e:link[#type="image/jpeg"]/#href')[0][0]
);
}

If that's the exact XML, actually there is no need for a foreach. Try this:
$xml = simplexml_load_file($url);
$my_title = (string) $xml->title;
$myimg = (string) $xml->link->attributes()['href']; // 5.4 or above
echo $myimg; // http://example.com/photos/f_0px_30px/image687.jpg

Try:
foreach($xml->entry as $PRODUCT)
{
$my_title = trim($PRODUCT->title[0]);
$url = trim($PRODUCT->id[0]);
$myimg = $PRODUCT->link[0];
}

RSS parsing with PHP and SimpleXML: How to enter namespaced items?

I am parsing the following RSS feed (relevant part shown)
<item>
<title>xxx</title>
<link>xxx</link>
<guid>xxx</guid>
<description>xxx</description>
<prx:proxy>
<prx:ip>101.226.74.168</prx:ip>
<prx:port>8080</prx:port>
<prx:type>Anonymous</prx:type>
<prx:ssl>false</prx:ssl>
<prx:check_timestamp>1369199066</prx:check_timestamp>
<prx:country_code>CN</prx:country_code>
<prx:latency>20585</prx:latency>
<prx:reliability>9593</prx:reliability>
</prx:proxy>
<prx:proxy>...</prx:proxy>
<prx:proxy>...</prx:proxy>
<pubDate>xxx</pubDate>
</item>
<item>...</item>
<item>...</item>
<item>...</item>
Using the php code
$proxylist_rss = file_get_contents('http://www.xxx.com/xxx.xml');
$proxylist_xml = new SimpleXmlElement($proxylist_rss);
foreach($proxylist_xml->channel->item as $item) {
var_dump($item); // Ok, Everything marked with xxx
var_dump($item->title); // Ok, title
foreach($item->proxy() as $entry) {
var_dump($entry); //empty
}
}
While I can access everything marked with xxx, I cannot access anything inside prx:proxy - mainly because : cannot be present in valid php varnames.
The question is how to reach prx:ip, as example.
Thanks!

Take a look at SimpleXMLElement::children, you can access the namespaced elements with that.
For example: -
<?php
$xml = '<xml xmlns:prx="http://example.org/">
<item>
<title>xxx</title>
<link>xxx</link>
<guid>xxx</guid>
<description>xxx</description>
<prx:proxy>
<prx:ip>101.226.74.168</prx:ip>
<prx:port>8080</prx:port>
<prx:type>Anonymous</prx:type>
<prx:ssl>false</prx:ssl>
<prx:check_timestamp>1369199066</prx:check_timestamp>
<prx:country_code>CN</prx:country_code>
<prx:latency>20585</prx:latency>
<prx:reliability>9593</prx:reliability>
</prx:proxy>
</item>
</xml>';
$sxe = new SimpleXMLElement($xml);
foreach($sxe->item as $item)
{
$proxy = $item->children('prx', true)->proxy;
echo $proxy->ip; //101.226.74.169
}
Anthony.

I would just strip out the "prx:"...
$proxylist_rss = file_get_contents('http://www.xxx.com/xxx.xml');
$proxylist_rss = str_replace('prx:', '', $proxylist_rss);
$proxylist_xml = new SimpleXmlElement($proxylist_rss);
foreach($proxylist_xml->channel->item as $item) {
foreach($item->proxy as $entry) {
var_dump($entry);
}
}
http://phpfiddle.org/main/code/jsz-vga

Try it like this:
$proxylist_rss = file_get_contents('http://www.xxx.com/xxx.xml');
$feed = simplexml_load_string($proxylist_rss);
$ns=$feed->getNameSpaces(true);
foreach ($feed->channel->item as $item){
var_dump($item);
var_dump($item->title);
$proxy = $item->children($ns["prx"]);
$proxy = $proxy->proxy;
foreach ($proxy as $key => $value){
var_dump($value);
}
}

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

How to get elements with custom name from rss? - php

The Tag name here is "encoded". Just use $content => $node->getElementsByTagName('encoded')->item(0)->nodeValue

Related

How to retrieve an attribute from a namespaced element

Array filter in PHP

Parsing RSS with PHP

Parsing xml with simplexml_load

RSS parsing with PHP and SimpleXML: How to enter namespaced items?

Categories

Resources