I have a form that, on POST, creates an XML document from the submitted values. I am using DOMDocument to build the XML. I pass the exact element names when creating the elements, but in the resulting XML the element names are lowercase, which my API does not accept. For example:
I give <LSP_Name>JaVAS</LSP_Name> as input, and the generated XML contains <lsp_name>JaVAS</lsp_name>.
I tried $xml->formatOutput = true; but had no luck.
Has anyone faced a similar issue?
DOCUMENT CREATION
$xml = new DOMDocument('1.0', 'utf-8');
$xml->formatOutput=true;
$root = $xml->createElement("NewDataSet");
$xml->appendChild($root);
$LSP_Code = $xml->createElement("LSP_Code");
$LSP_CodeText = $xml->createTextNode(LSP_CODE);
$LSP_Code->appendChild($LSP_CodeText);
$docket = $xml->createElement("Docket");
$root->appendChild($docket);
$docket->appendChild($LSP_Code);
TIA
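A minimal sketch for narrowing this down, under the assumption that the snippet above is the whole creation step. XML element names are case-sensitive in DOMDocument, and saveXML() writes them as they were created, so the lowercasing probably happens somewhere else (for instance when the result is re-parsed with loadHTML() or inspected in a browser as HTML). The value 'ABC123' is a hypothetical stand-in for the LSP_CODE constant.

```php
<?php
// Sketch: check whether DOMDocument itself changes the element-name case.
$xml = new DOMDocument('1.0', 'utf-8');
$xml->formatOutput = true;

$root = $xml->createElement('NewDataSet');
$xml->appendChild($root);

$docket = $xml->createElement('Docket');
$root->appendChild($docket);

// 'ABC123' is a hypothetical stand-in for the LSP_CODE value.
$lspCode = $xml->createElement('LSP_Code');
$lspCode->appendChild($xml->createTextNode('ABC123'));
$docket->appendChild($lspCode);

// Serialize as XML (not HTML) and look at the raw string.
$out = $xml->saveXML();
echo $out;
```

If the raw string from saveXML() shows the mixed-case names, the place to look is whatever reads or displays the XML afterwards.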
I am pulling HTML from Selenium, and then extracting data from the HTML using Xpaths.
This is the Xpath:
/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a
This is my code:
$data = $webdriver->getPageSource();
d($data, $urltemplate);
$doc = new DOMDocument();
$doc->loadHTML($data);
$xp = "/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a";
$xpatho = new DOMXpath($doc);
$elementsn = $xpatho->query($xp);
d(get_class($elementsn),$elementsn->count(),$xp,$name);
// d() is a custom function like var_dump().
I always get $elementsn->count() = 0.
This is $data:
https://pastebin.com/ahuvkJfN
I am trying to extract those strings like "NAD M10 BLUOS...", "NAD M12 DIRECT DIGITAL..." and so on...
I saved the HTML into a file, and opened it in my browser. I am attaching screenshot of what data I was looking to retrieve (highlighted in blue):
Basically, the HTML page is a product listing, and I am looking to extract all the product names. To confirm, I used Chrome Developer tools, and used the copy full Xpath function. I have the following Xpaths for some of the product names:
/html/body/div[2]/div[1]/div/div/div/div/ul/li[1]/div[1]/h3/a
/html/body/div[2]/div[1]/div/div/div/div/ul/li[3]/div[1]/h3/a
I would guess that this would generalise to:
/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a
However, I keep on getting a DOMNodeList with count = 0. Why is this so, and how can I check what the error is, if any?
P.S.: This is the original webpage: http://lenbrook.com.sg/3-shop-by-brand#/page-4/price-49-8667
Try changing your $xp to select the links by class instead of by absolute position:
$xp = '//a[@class="product_link"]/text()';
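A minimal sketch of that approach, which also shows one way to see parser errors. The HTML string is a hypothetical stand-in for $webdriver->getPageSource(); the class name product_link is taken from the answer above. An absolute path copied from Chrome describes the browser's DOM, which may not match the tree DOMDocument builds from the same source, so a class-based query tends to be less fragile.

```php
<?php
// Hypothetical stand-in for the page source returned by Selenium.
$data = '<html><body><ul>'
      . '<li><div><h3><a class="product_link" href="#">NAD M10</a></h3></div></li>'
      . '<li><div><h3><a class="product_link" href="#">NAD M12</a></h3></div></li>'
      . '</ul></body></html>';

// Collect libxml parse warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($data);
$parseErrors = libxml_get_errors(); // array of LibXMLError objects
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//a[@class="product_link"]');

$names = [];
if ($nodes === false) {
    // query() returns false when the expression itself is malformed.
    echo "Invalid XPath expression\n";
} else {
    foreach ($nodes as $a) {
        $names[] = trim($a->textContent);
    }
}
print_r($names);
```

A count of 0 with no XPath error means the expression is valid but matches nothing in the parsed tree; dumping $doc->saveHTML() shows the structure DOMDocument actually built.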
I am wondering if someone can suggest the best way to accomplish this.
We have an XML file with various data sets; within the XML there is one set of data per channel.
Each channel runs its own program.
What I was hoping to do is this: if ChannelA processes the XML
and hits an error for one reason or another, can I simply extract the node set it was processing and build a new XML document from it?
Or do I have to declare each node to build the XML and then essentially import it?
The real XML has around 30 to 40 nodes per set (the sample below is simplified), so typing these out in PHP would be ugly, and as we add more sets it would become horrible to maintain.
<data>
<MainUpdate>
<Chan>5</Chan>
<Data1></Data1>
<Data2></Data2>
<Data3></Data3>
<Data4></Data4>
<Data5></Data5>
</MainUpdate>
<MainUpdate>
<Chan>8</Chan>
<Data1></Data1>
<Data2></Data2>
<Data3></Data3>
<Data4></Data4>
<Data5></Data5>
</MainUpdate>
<MainUpdate>
<Chan>10</Chan>
<Data1></Data1>
<Data2></Data2>
<Data3></Data3>
<Data4></Data4>
<Data5></Data5>
</MainUpdate>
</data>
If Channel8 processes this XML, it is only processing the Channel8 data.
I want to be able to create an XML document with just:
<data>
<MainUpdate>
<Chan>8</Chan>
<Data1></Data1>
<Data2></Data2>
<Data3></Data3>
<Data4></Data4>
<Data5></Data5>
</MainUpdate>
</data>
without declaring all the nodes.
Use XPath to select the desired node (and its children), and
importNode() to copy it into a new DOMDocument.
Code example:
$xpath = new DOMXpath($doc); // assume original XML in $doc
$e = $xpath->query("/data/MainUpdate[Chan = '8']")->item(0);
Note the condition in []: assuming <Chan> is unique, the resulting DOMNodeList will contain one DOMElement, which we grab with item(0) and store in $e.
$e is NULL if there is no such <Chan>; check this before proceeding.
Now create a new document with <data></data> as root and import $e:
$newdoc = new DOMDocument();
$newdoc->loadXML("<data />");
$e = $newdoc->importNode($e, true);
$newdoc->documentElement->appendChild($e);
see it working: https://eval.in/513276
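Put together, the steps above could look like the following sketch. The function name extractChannel and the shortened sample XML are hypothetical; the XPath expression and the importNode() call are the ones from the answer.

```php
<?php
// Shortened sample of the XML from the question (hypothetical values).
$source = <<<XML
<data>
  <MainUpdate><Chan>5</Chan><Data1>a</Data1></MainUpdate>
  <MainUpdate><Chan>8</Chan><Data1>b</Data1></MainUpdate>
  <MainUpdate><Chan>10</Chan><Data1>c</Data1></MainUpdate>
</data>
XML;

// Returns a new XML string holding only the <MainUpdate> for $chan,
// or null when no such channel exists. extractChannel is a made-up name.
function extractChannel(string $xmlString, string $chan): ?string
{
    // Only digits are accepted, so the value is safe to place in the XPath.
    if (!ctype_digit($chan)) {
        return null;
    }

    $doc = new DOMDocument();
    $doc->loadXML($xmlString);

    $xpath = new DOMXPath($doc);
    $node = $xpath->query("/data/MainUpdate[Chan = '" . $chan . "']")->item(0);
    if ($node === null) {
        return null;
    }

    $newDoc = new DOMDocument('1.0', 'UTF-8');
    $newDoc->loadXML('<data/>');
    // true = deep copy, so every child node comes along without being named.
    $newDoc->documentElement->appendChild($newDoc->importNode($node, true));

    return $newDoc->saveXML();
}

$result = extractChannel($source, '8');
echo $result;
```

Because the copy is deep, adding more child nodes to a set later requires no change to this code.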
I'm new to XML handling in PHP.
I wrote this script to take variables from POST and put each one into a tag named after the variable, with the posted value as the tag's content.
I have also tried setting $id_picture to foo instead of the posted data, with the same result.
$xml = new DOMDocument('1.0', 'UTF-8');
$id_picture = $_POST['id_picture'];
$xml_id_picture = $xml->createElement("id_picture");
$xml_id_picture_node = $xml->createTextNode($id_picture);
$xml_id_picture->appendChild($xml_id_picture_node);
//upload xml
$xml->save('xml.xml');
What I'm trying to achieve is to save the posted data into a variable and then turn it into an XML tag with the data in between, which is where I get lost. The expected output is:
<id_picture>foo</id_picture>
You never inserted your new node into the main object. You need something like
$xml->appendChild($xml_id_picture);
so that your newly created id_picture node will actually show up in your document.
You have to add the node to the document.
$xml = new DOMDocument('1.0', 'UTF-8');
$id_picture = $_POST['id_picture'];
$xml_id_picture = $xml->createElement("id_picture");
$xml_id_picture_node = $xml->createTextNode($id_picture);
$xml_id_picture->appendChild($xml_id_picture_node);
$xml->appendChild($xml_id_picture);//<-- here
//upload xml
$xml->save('xml.xml');
http://codepad.org/6Ml2cUYe
It looks like you are creating an element, appending a node to it, but then you are not appending the element to the document.
I have installed a JSON plugin and retrieved the content of an HTML page. Now I want to parse it and find a particular table, which has only a class but no id. I parse it using the PHP class DOMDocument. My idea is to access the tag just before the table and then somehow move to the next tag (my table) using DOMDocument.
Example:
<a name="Telefonliste" id="Telefonliste"></a>
<table class="wikitable">
So I first get the <a>, and after that I get the <table>.
I have got all the tables using the following commands and especially getElementsByTagName(). After that I can access item(2) where my table is:
$dom = new DOMDocument();
//load html source
$html = $dom->loadHTML($myHtml);
//discard white space
$dom->preserveWhiteSpace = false;
//the table by its tag name
$table = $dom->getElementsByTagName('table');
$rows = $table->item(2)->getElementsByTagName('tr');
This works, but I want to make it more general. Right now I know that the table is at item(2), but its position can change, e.g. if a new table is added to the HTML page before mine; then my table will be at item(3) instead. So I want to parse the page in a way that still reaches this table without changing my code. Can I do this using DOMDocument as a DOM parser?
You can use DOMXPath, and make the expression as general as you need it.
For example:
$dom = new DOMDocument();
//discard white space
$dom->preserveWhiteSpace = false;
//load html source
$dom->loadHTML($myHtml);
$domxpath = new DOMXPath($dom);
// XPath positions are 1-based; the parentheses pick the first match in the whole document
$table = $domxpath->query('(//table[@class="wikitable" and not(@id)])[1]')->item(0);
$elementBeforeTable = $table->previousSibling;
$rows = $table->getElementsByTagName('tr');
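The idea from the question (find the <a> first, then the table after it) can also be written as a single XPath expression. This is a sketch under the assumption that the anchor with id="Telefonliste" always precedes the wanted table; the HTML string is a hypothetical, shortened page.

```php
<?php
// Hypothetical page: an unrelated table comes first, then the anchor and the wanted table.
$myHtml = '<div>'
        . '<table class="wikitable"><tr><td>other</td></tr></table>'
        . '<a name="Telefonliste" id="Telefonliste"></a>'
        . '<table class="wikitable">'
        . '<tr><td>Alice</td><td>123</td></tr>'
        . '<tr><td>Bob</td><td>456</td></tr>'
        . '</table>'
        . '</div>';

libxml_use_internal_errors(true); // keep HTML parser warnings quiet
$dom = new DOMDocument();
$dom->loadHTML($myHtml);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// following::table[1] is the first table after the anchor in document order.
$table = $xpath->query('//a[@id="Telefonliste"]/following::table[1]')->item(0);

$rows = [];
if ($table !== null) {
    foreach ($table->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
}
print_r($rows);
```

With this query, tables added before the anchor do not change which table is found.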
I've started writing a simple extension of this for the purpose of web scraping. I'm not 100% on the direction I want to take with it yet, but you can see an example of how to get the original HTML back in the response of the search rather than just raw text.
https://github.com/WolfeDev/PageScraper
EDIT: I plan on implementing basic table parsing soon.
I am attempting to send html data in a question form from my php web application to mechanical turk so a user can see the entire html document from an email to work with. I have had difficulty thus far. In the thread linked below, I attempted to parse the html data using html5-lib.php, but I think I'm still missing a step in order to complete this.
Here's the current error I'm receiving:
Catchable fatal error: Object of class DOMNodeList could not be converted to string in urlgoeshere.php on line 35
Here's the current code I'm working with...
$thequestion = 'click here';
$thequestion = HTML5_Parser::parseFragment($thequestion);
var_dump($thequestion);
echo $thequestion;
//htmlspecialchars($thequestion);
$QuestionXML = '<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
<Question>
<QuestionContent>
<Text>'.$thequestion.'</Text> //<--- Line35
</QuestionContent>
<AnswerSpecification>
<FreeTextAnswer/>
</AnswerSpecification>
</Question>
</QuestionForm> ';
I'm not 100% sure if the parser is what I need to do in order to have this sent correctly - All I want to do is send html through this xml type document - I'm surprised it's been this difficult thus far.
This is somewhat of a continuation of another thread -
What PHP code will help me parse html data in an xml form?
Take a look at DOMDocument for working with DOM/xml in PHP. If you want to embed HTML in your XML, use CDATA sections like this:
$QuestionXML = '<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
<Question>
<QuestionContent>
<Text><![CDATA['.$thequestion.']]></Text>
</QuestionContent>
<AnswerSpecification>
<FreeTextAnswer/>
</AnswerSpecification>
</Question>
</QuestionForm> ';
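The same CDATA idea can be expressed with DOMDocument instead of string concatenation. This is only a sketch: it assumes $thequestion is still the raw HTML string (not the DOMNodeList returned by the parser, which is what triggers the "could not be converted to string" error), the sample HTML value is hypothetical, and only the <Text> part of the form is built here.

```php
<?php
$ns = 'http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd';

// Hypothetical HTML payload; keep it as a plain string.
$thequestion = '<a href="http://example.com">click here</a>';

$dom = new DOMDocument('1.0', 'UTF-8');

// appendChild() returns the appended node, so the calls can be chained.
$form     = $dom->appendChild($dom->createElementNS($ns, 'QuestionForm'));
$question = $form->appendChild($dom->createElementNS($ns, 'Question'));
$content  = $question->appendChild($dom->createElementNS($ns, 'QuestionContent'));
$text     = $content->appendChild($dom->createElementNS($ns, 'Text'));

// Wrap the HTML in a CDATA section so its markup is not parsed as XML.
$text->appendChild($dom->createCDATASection($thequestion));

$questionXML = $dom->saveXML();
echo $questionXML;
```

HTML that itself contains the sequence ]]> would end the CDATA section early and needs separate handling.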
I'm not sure exactly what you're after, but this is how I would create the XML that needs to be transferred.
Please let me know if I misunderstood the question.
It also seems that you are missing the QuestionIdentifier node, which is mandatory according to the .xsd file.
<?php
$dom = new DOMDocument('1.0','UTF-8');
$dom->formatOutput = true;
$QuestionForm = $dom->createElement('QuestionForm');
$QuestionForm->setAttribute('xmlns','http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd');
// could loop this part for all the questions of the XML
$thequestion = 'click here';
//Not sure what this is supposed to be, but its required. Check the specs of the app for it.
$questionID = "";
$Question = $dom->createElement('Question');
$QuestionIdentifier = $dom->createElement('QuestionIdentifier',$questionID);
$QuestionContent = $dom->createElement('QuestionContent');
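// createElement() does not escape "&" in its value argument, so add the text as a text node
$Text = $dom->createElement('Text');
$Text->appendChild($dom->createTextNode($thequestion));
$QuestionContent->appendChild($Text);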
$AnswerSpecification = $dom->createElement('AnswerSpecification');
$AnswerSpecification->appendChild($dom->createElement('FreeTextAnswer'));
$Question->appendChild($QuestionIdentifier);
$Question->appendChild($QuestionContent);
$Question->appendChild($AnswerSpecification);
$QuestionForm->appendChild($Question);
// End loop
$dom->appendChild($QuestionForm);
$xmlString = $dom->saveXML();
print($xmlString);
?>