I'm creating a "Madlibs" page where visitors can create funny story things online. The original files are in XML format with the blanks enclosed in XML tags
(Such as blablabla <PluralNoun></PluralNoun> blablabla <Verb></Verb> ).
The form data is created using XSL and the results are saved using a $_POST array. How do I post the $_POST array between the matching XML tags and then display the result to the page? I'm sure it uses a "foreach" statement, but I'm just not familiar enough with PHP to figure out what functions to use. Any help would be great.
Thanks,
E
I'm not sure if I understood your problem quite well, but I think this might help:
// mocking some $_POST variables
$_POST['Verb'] = 'spam';
$_POST['PluralNoun'] = 'eggs';
// original template with blanks (should be loaded from a valid XML file)
$xml = 'blablabla <PluralNoun></PluralNoun> blablabla <Verb></Verb>';
$valid_xml = '<?xml version="1.0"?><xml>' . $xml . '</xml>';
// loadXML() is an instance method; calling it statically fails on PHP 8
$doc = new DOMDocument();
if ($doc->loadXML($valid_xml, LIBXML_NOERROR)) {
$text = ''; // used to accumulate output while walking XML tree
foreach ($doc->documentElement->childNodes as $child) {
if ($child->nodeType == XML_TEXT_NODE) { // keep text nodes
$text .= $child->wholeText;
} else if (array_key_exists($child->tagName, $_POST)) {
// replace nodes whose tag matches a POST variable
$text .= $_POST[$child->tagName];
} else { // keep other nodes
$text .= $doc->saveXML($child);
}
}
echo $text . "\n";
} else {
echo "Failed to parse XML\n";
}
Here is PHP foreach syntax. Hope it helps
$arr = array('fruit1' => 'apple', 'fruit2' => 'orange');
foreach ($arr as $key => $val) {
echo "$key = $val\n";
}
and here is the code to loop through your $_POST variables:
foreach ($_POST as $key => $val) {
echo "$key = $val\n";
// then you can fill each POST var to your XML
// maybe you want to use PHP str_replace function too
}
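The str_replace idea mentioned in the comment above could look roughly like this. It is only a sketch: fill_blanks() is a hypothetical helper name, $answers stands in for $_POST, and it assumes every blank is written exactly as an empty <Tag></Tag> pair. The values are escaped because they come from visitors.

```php
<?php
// Hypothetical helper: fills empty <Tag></Tag> placeholders with submitted values.
function fill_blanks(string $template, array $answers): string
{
    foreach ($answers as $tag => $value) {
        // Escape the value so user input cannot inject markup into the page.
        $safe = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
        $template = str_replace("<$tag></$tag>", $safe, $template);
    }
    return $template;
}

// Stand-in for $_POST (assumption: keys match the tag names exactly)
$answers = ['PluralNoun' => 'eggs', 'Verb' => 'spam'];
$story = fill_blanks('blablabla <PluralNoun></PluralNoun> blablabla <Verb></Verb>', $answers);
echo $story, "\n"; // blablabla eggs blablabla spam
```

Unlike the DOM approach, this does not validate the template, so a blank written as <Verb/> would be left untouched.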
I've been trying unsuccessfully with PHP to loop through two XML files and print the result to the screen. The aim is to take a country's name and output its regions/states/provinces as the case may be.
The first block of code successfully prints all the countries but the loop through both files gives me a blank screen.
The countries file is in the format:
<row>
<id>6</id>
<name>Andorra</name>
<iso2>AD</iso2>
<phone_code>376</phone_code>
</row>
And the states.xml:
<row>
<id>488</id>
<name>Andorra la Vella</name>
<country_id>6</country_id>
<country_code>AD</country_code>
<state_code>07</state_code>
</row>
so that country_id = id.
This gives a perfect list of countries:
$xml = simplexml_load_file("countries.xml");
$xml1 = simplexml_load_file("states.xml");
foreach($xml->children() as $key => $children) {
print((string)$children->name); echo "<br>";
}
This gives me a blank screen except for the HTML stuff on the page:
$xml = simplexml_load_file("countries.xml");
$xml1 = simplexml_load_file("states.xml");
$s = "Jamaica";
foreach($xml->children() as $child) {
foreach($xml1->children() as $child2){
if ($child->id == $child2->country_id && $child->name == $s) {
print((string)$child2->name);
echo "<br>";
}
}
}
Where have I gone wrong?
Thanks.
I suspect your problem is not casting the name to a string before doing your comparison. But why are you starting the second loop before checking if it's needed? You're looping through every single item in states.xml needlessly.
$countries = simplexml_load_file("countries.xml");
$states = simplexml_load_file("states.xml");
$search = "Jamaica";
foreach($countries->children() as $country) {
if ((string)$country->name !== $search) {
continue;
}
foreach($states->children() as $state) {
if ((string)$country->id === (string)$state->country_id) {
echo (string)$state->name . "<br/>";
}
}
}
Also, note that naming your variables in a descriptive manner makes it much easier to figure out what's going on with code.
You could probably get rid of the loops altogether using an XPath query to match the sibling value. I don't use SimpleXML, but here's what it would look like with DomDocument:
$search = "Jamaica";
$countries = new DomDocument();
$countries->load("countries.xml");
$xpath = new DomXPath($countries);
$country = $xpath->query("//row[name/text() = '$search']/id/text()");
$country_id = $country[0]->nodeValue;
$states = new DomDocument();
$states->load("states.xml");
$xpath = new DomXPath($states);
$states = $xpath->query("//row[country_id/text() = '$country_id']/name/text()");
foreach ($states as $state) {
echo $state->nodeValue . "<br/>";
}
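For anyone who would rather stay with SimpleXML, the same lookup can be sketched with SimpleXMLElement::xpath(). The inline XML below stands in for countries.xml and states.xml; the root element names and the second state row are made up for illustration.

```php
<?php
// Inline sample data standing in for countries.xml and states.xml;
// the root element names and the "Elsewhere" row are assumptions.
$countries = simplexml_load_string(
    '<countries><row><id>6</id><name>Andorra</name></row></countries>'
);
$states = simplexml_load_string(
    '<states>'
    . '<row><id>488</id><name>Andorra la Vella</name><country_id>6</country_id></row>'
    . '<row><id>999</id><name>Elsewhere</name><country_id>7</country_id></row>'
    . '</states>'
);

$search = 'Andorra';
$names = [];
// xpath() returns an array of matching SimpleXMLElement objects.
$ids = $countries->xpath("//row[name='$search']/id");
if ($ids) {
    $countryId = (string) $ids[0];
    foreach ($states->xpath("//row[country_id='$countryId']/name") as $name) {
        $names[] = (string) $name;
    }
}
echo implode("<br/>", $names), "\n";
```

Note that $search is interpolated straight into the XPath expression, so a value containing a single quote would break the query; that is fine for a fixed string but not for user input.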
When I try to import an XML file, it returns an error:
"xmlParseCharRef: invalid xmlChar value 4" because of &amp; in the XML: <STOCKGROUP NAME="ABC & Glass" RESERVEDNAME="">
Here is code I'm using to read xml file data
function add_product_type()
{
print_r($_FILES);
if (isset($_FILES['product_type_file']) && ($_FILES['product_type_file']['error'] == UPLOAD_ERR_OK)) {
$use_errors = libxml_use_internal_errors(true);
$response = simplexml_load_file($_FILES['product_type_file']['tmp_name']);
print_r($response);
foreach($response->BODY->IMPORTDATA as $key => $value) {
foreach($value->REQUESTDATA->TALLYMESSAGE as $key => $values) {
if (strstr("&", $values->STOCKGROUP->attributes())) {
$name = str_replace("&", "&", $values->STOCKGROUP->attributes());
}
else {
$name = $values->STOCKGROUP->attributes();
}
echo $name . ",";
}
}
if ($response == false) {
echo "Failed loading XML\n";
foreach(libxml_get_errors() as $error) {
echo "\t", $error->message;
}
}
}
}
XML has no notion of HTML entities. As a hack, you can decode the entities first with
$html = html_entity_decode($file_contents, ENT_QUOTES, "utf-8");
and then try to parse $html with the XML parser. Just hope it's tolerant enough, because HTML is still not valid XML.
The good news is that you can then remove the if (strstr("&", hack because that is taken care of by html_entity_decode().
Your XML sample isn't the same data as you're getting the error with, and it processes OK, but this doesn't mean that your code is correct. In your main loop, you use $values->STOCKGROUP->attributes() as the name, but if you do a print_r() of this value, you will see it comes out with a list of all of the attribute values, and your code will combine them all into one string.
SimpleXMLElement Object
(
[#attributes] => Array
(
[NAME] => Alluminium Section & Glass
[RESERVEDNAME] =>
)
)
If you just want the NAME attribute of <STOCKGROUP NAME="Alluminium Section & Glass" RESERVEDNAME="">, then use STOCKGROUP['NAME']. Also rather than just substitute &, use html_entity_decode() to convert any characters in the name.
foreach($response->BODY->IMPORTDATA as $key => $value) {
foreach($value->REQUESTDATA->TALLYMESSAGE as $key => $values) {
$name = html_entity_decode((string)$values->STOCKGROUP['NAME']);
echo $name . ",";
}
}
This code with the sample you provide outputs...
Alluminium Section & Glass,
Update:
Check the file being loaded...
$data = file_get_contents($_FILES['product_type_file']['tmp_name']);
echo $data;
$use_errors = libxml_use_internal_errors(true);
$response = simplexml_load_string($data);
echo $response->asXML();
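To see the difference between an unescaped and an escaped ampersand, a small check like the following may help. It is a sketch using a cut-down element rather than the real file: a bare & makes the document not well-formed, so the load fails, while &amp; parses and is already decoded when the attribute is read.

```php
<?php
libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings

// Cut-down stand-ins for the real file contents
$bad  = '<STOCKGROUP NAME="ABC & Glass" RESERVEDNAME=""/>';
$good = '<STOCKGROUP NAME="ABC &amp; Glass" RESERVEDNAME=""/>';

$badResult = simplexml_load_string($bad);   // false: a bare & is not well-formed XML
$errors = libxml_get_errors();              // the parser's complaints about $bad
libxml_clear_errors();

$goodResult = simplexml_load_string($good);
$name = (string) $goodResult['NAME'];       // the parser decodes &amp; for you
echo $name, "\n";                           // ABC & Glass
```

If the uploaded file fails like $bad does, the fix belongs in whatever produces the file, or in a clean-up pass on the raw string before it is parsed.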
I am somewhat new with PHP, but can't really wrap my head around what I am doing wrong here given my situation.
Problem: I am trying to get the href of a certain HTML element within a string of characters inside an XML object/element via Reddit (if you visit this page, it would be the actual link of the video - not the reddit link but the external youtube link or whatever - nothing else).
Here is my code so far (code updated):
Update: Loop-mania! Got all of the hrefs, but am now trying to store them inside a global array to access a random one outside of this function.
function getXMLFeed() {
echo "<h2>Reddit Items</h2><hr><br><br>";
//$feedURL = file_get_contents('https://www.reddit.com/r/videos/.xml?limit=200');
$feedURL = 'https://www.reddit.com/r/videos/.xml?limit=200';
$xml = simplexml_load_file($feedURL);
//define each xml entry from reddit as an item
foreach ($xml -> entry as $item ) {
foreach ($item -> content as $content) {
$newContent = (string)$content;
$html = str_get_html($newContent);
foreach($html->find('table') as $table) {
$links = $table->find('span', '0');
//echo $links;
foreach($links->find('a') as $link) {
echo $link->href;
}
}
}
}
}
XML Code:
http://pasted.co/0bcf49e8
I've also included JSON if it can be done this way; I just preferred XML:
http://pasted.co/f02180db
That is pretty much all of the code. Though, here is another piece I tried to use with DOMDocument (scrapped it).
foreach ($item -> content as $content) {
$dom = new DOMDocument();
$dom -> loadHTML($content);
$xpath = new DOMXPath($dom);
$classname = "/html/body/table[1]/tbody/tr/td[2]/span[1]/a";
foreach ($dom->getElementsByTagName('table') as $node) {
echo $dom->saveHtml($node), PHP_EOL;
//$originalURL = $node->getAttribute('href');
}
//$html = $dom->saveHTML();
}
I can parse the table fine, but when it comes to getting certain element's values (nothing has an ID or class), I can only seem to get ALL anchor tags or ALL table rows, etc.
Can anyone point me in the right direction? Let me know if there is anything else I can add here. Thanks!
Added HTML:
I am specifically trying to extract <span>[link]</span> from each table/item.
http://pastebin.com/QXa2i6qz
The following code extracts the external video link (for example a YouTube URL) from each entry's content.
function extract_youtube_link($xml) {
$entries = $xml['entry'];
$videos = [];
foreach($entries as $entry) {
$content = html_entity_decode($entry['content']);
// [^"]* stops at the closing quote, so the match cannot run on into a later link
preg_match_all('/<span><a href="([^"]*)">\[link\]/', $content, $matches);
if(!empty($matches[1][0])) {
$videos[] = array(
'entry_title' => $entry['title'],
'author' => preg_replace('/\/(.*)\//', '', $entry['author']['name']),
'author_reddit_url' => $entry['author']['uri'],
'video_url' => $matches[1][0]
);
}
}
return $videos;
}
$xml = simplexml_load_file('reddit.xml');
$xml = json_decode(json_encode($xml), true);
$videos = extract_youtube_link($xml);
foreach($videos as $video) {
echo "<p>Entry Title: {$video['entry_title']}</p>";
echo "<p>Author: {$video['author']}</p>";
echo "<p>Author URL: {$video['author_reddit_url']}</p>";
echo "<p>Video URL: {$video['video_url']}</p>";
echo "<br><br>";
}
The function returns a multidimensional array in which each element contains entry_title, author, author_reddit_url and video_url. Hope it helps you!
If you're looking for a specific element, you don't need to parse the whole thing. One way of doing it is to use the DOMXPath class and query the XML directly. The documentation should guide you through it.
http://php.net/manual/es/class.domxpath.php .
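As a rough illustration of the DOMXPath route, the sketch below runs a query against one decoded content fragment. The $content string is a made-up stand-in shaped like the table Reddit puts in each entry, and the example.com URLs are placeholders.

```php
<?php
// Hypothetical fragment shaped like one entry's decoded <content>.
$content = '<table><tr><td>'
    . '<span><a href="https://example.com/video">[link]</a></span> '
    . '<span><a href="https://example.com/comments">[comments]</a></span>'
    . '</td></tr></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // HTML fragments often trigger parser warnings
$dom->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$hrefs = [];
// Select only anchors inside a span whose text is exactly "[link]".
foreach ($xpath->query('//span/a[text()="[link]"]') as $anchor) {
    $hrefs[] = $anchor->getAttribute('href');
}
print_r($hrefs);
```

Returning $hrefs from the function, rather than writing to a global, makes it easy to pick a random entry afterwards with $hrefs[array_rand($hrefs)].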
Hello, I'm new to DOMNode and I'm trying to check the values from an XML tree, which loads OK.
Here is my code, but I don't understand why it is not working.
private function createCSV($xml, $f)
{
foreach ($xml->getElementsByTagName('*') as $item)
{
$hasChild = $item->hasChildNodes() ? true : false;
if(!$hasChild)
{
//echo 'Doesn\'t have children';
echo 'Value: ' . $item->nodeValue;
}
else
{
//echo 'Has children';
$this->createCSV($item, $f);
}
}
}
$item->nodeValue doesn't print anything to the browser.
I read the documentation but I can't see any mistake.
PS. $item->tagname doesn't work either.
UPDATE
When using this: echo $item->ownerDocument->saveHTML($item);
I get the tags listed, but I don't get the data inside (between the tags), like innerHTML in JavaScript.
UPDATE
sample xml data : http://pastebin.com/dkuUUC0Q
Text nodes are also considered child nodes, but you're only iterating element nodes (getElementsByTagName). Because of this you're almost never getting into the 2nd condition.
Try this:
if(!$xml->hasChildNodes()){
printf('Value: %s', $xml->nodeValue);
return;
}
foreach($xml->childNodes as $item)
$this->createCSV($item, $f);
XPath version:
$xpath = new DOMXPath($xml);
$text = $xpath->query('//text()[normalize-space()]');
foreach($text as $node)
printf('Value: %s', $node->nodeValue);
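The point about text nodes can be checked with a tiny document. This is only a sketch; the <order>, <id> and <note> names are invented for the example.

```php
<?php
$doc = new DOMDocument();
// Made-up sample: one element with text, one empty element
$doc->loadXML('<order><id>42</id><note/></order>');

$id   = $doc->getElementsByTagName('id')->item(0);
$note = $doc->getElementsByTagName('note')->item(0);

var_dump($id->hasChildNodes());   // bool(true): the text "42" is itself a child node
var_dump($note->hasChildNodes()); // bool(false): an empty element has no children

$firstChildType = $id->firstChild->nodeType; // equals XML_TEXT_NODE
```

So an element that looks like a leaf in the source still reports children, which is why the !$hasChild branch in the question was rarely reached.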
Is it possible to use a foreach loop to scrape multiple URLs from an array? I've been trying, but for some reason it will only pull from the first URL in the array and then show the results.
include_once('../../simple_html_dom.php');
$link = array (
'http://www.amazon.com/dp/B0038JDEOO/',
'http://www.amazon.com/dp/B0038JDEM6/',
'http://www.amazon.com/dp/B004CYX17O/'
);
foreach ($link as $links) {
function scraping_IMDB($links) {
// create HTML DOM
$html = file_get_html($links);
$values = array();
foreach($html->find('input') as $element) {
$values[$element->id=='ASIN'] = $element->value; }
// get title
$ret['ASIN'] = end($values);
// get rating
$ret['Name'] = $html->find('h1[class="parseasinTitle"]', 0)->innertext;
$ret['Retail'] =$html->find('b[class="priceLarge"]', 0)->innertext;
// clean up memory
//$html->clear();
// unset($html);
return $ret;
}
// -----------------------------------------------------------------------------
// test it!
$ret = scraping_IMDB($links);
foreach($ret as $k=>$v)
echo '<strong>'.$k.'</strong>'.$v.'<br />';
}
Here is the code since the comment part didn't work. :) It's very dirty because I just edited one of the examples to play with it to see if I could get it to do what I wanted.
include_once('../../simple_html_dom.php');
function scraping_IMDB($links) {
// create HTML DOM
$html = file_get_html($links);
// What is this spaghetti code good for?
/*
$values = array();
foreach($html->find('input') as $element) {
$values[$element->id=='ASIN'] = $element->value;
}
// get title
$ret['ASIN'] = end($values);
*/
foreach($html->find('input') as $element) {
if($element->id == 'ASIN') {
$ret['ASIN'] = $element->value;
}
}
// Or you could use the following instead of the whole foreach loop above
//
// $ret['ASIN'] = $html->find('input[id="ASIN"]', 0)->value;
//
// assuming the 0 means "return the first match" or something similar.
// I just had a look at Amazon's source code, and it contains
// 2 HTML tags with id='ASIN'. If they were following the HTML rules,
// there should only be ONE element with a specific id.
// get rating
$ret['Name'] = $html->find('h1[class="parseasinTitle"]', 0)->innertext;
$ret['Retail'] = $html->find('b[class="priceLarge"]', 0)->innertext;
// clean up memory
//$html->clear();
// unset($html);
return $ret;
}
// -----------------------------------------------------------------------------
// test it!
$links = array (
'http://www.amazon.com/dp/B0038JDEOO/',
'http://www.amazon.com/dp/B0038JDEM6/',
'http://www.amazon.com/dp/B004CYX17O/'
);
foreach ($links as $link) {
$ret = scraping_IMDB($link);
foreach($ret as $k=>$v) {
echo '<strong>'.$k.'</strong>'.$v.'<br />';
}
}
This should do the trick
I have renamed the array to 'links' instead of 'link'. It's an array of links, containing link(s), therefore, foreach($link as $links) seemed wrong, and I changed it to foreach($links as $link)
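One likely reason the original loop stopped after the first URL: a named function declared inside the foreach body gets declared again on the second iteration, and PHP rejects that as a redeclaration. The sketch below shows only the structure. scrape_stub() is a hypothetical stand-in that does no network access and simply derives a value from the URL.

```php
<?php
// Hypothetical stand-in for the scraper: no network access, it just
// derives a value from the URL so the loop structure can be shown.
function scrape_stub(string $url): array
{
    return ['ASIN' => basename(rtrim($url, '/'))];
}

$links = [
    'http://www.amazon.com/dp/B0038JDEOO/',
    'http://www.amazon.com/dp/B0038JDEM6/',
];

$results = [];
foreach ($links as $link) {
    // The function is declared once, above; only the call sits in the loop.
    $results[$link] = scrape_stub($link);
}
print_r($results);
```

Collecting the results in an array keyed by URL also keeps the scraping separate from the output, so the echo loop can run afterwards.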
I really need to ask this question as it will answer way more questions after the world reads this thread. What if ... you used articles like the simple html dom site.
$ret['Name'] = $html->find('h1[class="parseasinTitle"]', 0)->innertext;
$ret['Retail'] = $html->find('b[class="priceLarge"]', 0)->innertext;
return $ret;
}
$links = array (
'http://www.amazon.com/dp/B0038JDEOO/',
'http://www.amazon.com/dp/B0038JDEM6/',
'http://www.amazon.com/dp/B004CYX17O/'
);
foreach ($links as $link) {
$ret = scraping_IMDB($link);
foreach($ret as $k=>$v) {
echo '<strong>'.$k.'</strong>'.$v.'<br />';
}
}
what if its $articles?
$articles[] = $item;
}
//print_r($articles);
$links = array (
'http://link1.com',
'http://link2.com',
'http://link3.com'
);
what would this area look like?
foreach ($links as $link) {
$ret = scraping_IMDB($link);
foreach($ret as $k=>$v) {
echo '<strong>'.$k.'</strong>'.$v.'<br />';
}
}
I've seen this multiple-links question all over Stack Overflow for the past 2 years, and I still cannot figure it out. It would be great to get a basic handle on how to do it the way the simple html dom examples do.
Thanks.
First time posting; I'm sure I broke a bunch of rules and didn't do the code section right. I just had to ask this question badly.