Here's the code:
for($i = 0; $i < count(array_values($resources['titles'])); $i++){
//var_dump($key);
$ad = $xml->addChild('ad');
$ad->addChild('title', htmlentities(htmlspecialchars(substr($resources['titles'][$i], 0, 70))));
$ad->addChild('text', 'Текст текст');
$ad->addChild('price', htmlentities($resources['prices'][$i]));
//file_put_contents('test.txt',$resources['titles'][$i]."\n", FILE_APPEND);
}
$xml->asXML($this->_xmlOutput);
It saves all the data fine, but the XML file is not well formatted, and the Cyrillic characters (there are a lot of them) are turned into ч (what is that code?). Also, the file is saved as ANSI, not UTF-8. So the question is: how do I properly create a well-formatted, readable XML document that keeps the Cyrillic characters?
Prefix the XML with the appropriate encoding header and tags.
First line in XML should be:
<?xml version="1.0" encoding="UTF-8"?>
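With SimpleXML, one way to get that declaration is to pass it in when the root element is created. This is only a minimal sketch: the root element name ads is an assumption, since the question does not show how $xml was created.

```php
<?php
// Assumption: the root element is called "ads"; the question does not show it.
$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><ads/>');

$ad = $xml->addChild('ad');
// The source string must itself be valid UTF-8.
$ad->addChild('text', 'Текст текст');

// Because the document declares UTF-8, the Cyrillic text is written as-is.
$out = $xml->asXML();
echo $out;
```

Note that addChild() does not escape a literal & in the value, so values that may contain one still need htmlspecialchars() (once, not combined with htmlentities()).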
I found a better solution using DOMDocument. Here's a rewritten example of the code inside the loop:
$node_ad = $xml->CreateElement('ad');
$node_ads->appendChild($node_ad);
//$node_ads->addChild('title', htmlentities(htmlspecialchars(substr($resources['titles'][$i], 0, 70))));
$title = $xml->CreateElement('title', htmlentities(htmlspecialchars(substr($resources['titles'][$i], 0, 70))));
$node_ad->appendChild($title);
$text = $xml->CreateElement('text', 'Текст текст');
$node_ad->appendChild($text);
$images_node = $xml->CreateElement('images');
$node_ad->appendChild($images_node);
$images = $xml->CreateElement('image', $this->_mainUrl.'/uploads/'.$resources['images'][$i]);
$images_node->appendChild($images);
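For anyone who wants the whole thing in one piece, here is a sketch of how the DOMDocument approach could look end to end. The sample $resources data and the root element name ads are made up for illustration, and mb_substr() assumes the mbstring extension is available. It is one possible way to do it, not necessarily what the code above ended up as.

```php
<?php
// Hypothetical sample data standing in for $resources from the question.
$resources = [
    'titles' => ['Продам гараж & сарай', 'Квартира'],
    'prices' => ['1000', '2500'],
];

$doc = new DOMDocument('1.0', 'UTF-8'); // writes encoding="UTF-8" in the declaration
$doc->formatOutput = true;               // indent the output so it is readable

// Assumption: the root element is called "ads".
$ads = $doc->appendChild($doc->createElement('ads'));

foreach ($resources['titles'] as $i => $title) {
    $ad = $ads->appendChild($doc->createElement('ad'));

    // createTextNode() escapes &, < and > on output, so no htmlentities() is needed.
    // mb_substr() cuts on characters, so a multi-byte letter is never split in half.
    $titleNode = $ad->appendChild($doc->createElement('title'));
    $titleNode->appendChild($doc->createTextNode(mb_substr($title, 0, 70, 'UTF-8')));

    $priceNode = $ad->appendChild($doc->createElement('price'));
    $priceNode->appendChild($doc->createTextNode($resources['prices'][$i]));
}

$xmlString = $doc->saveXML();
echo $xmlString;
```

To write the result to a file instead, $doc->save($path) can be used in place of saveXML().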
When I run
$result = exec("curl someURL");
I get results that contain \u003c-style escape sequences:
"d":"\u003c?xml version=\"1.0\"?\u003e\u003cArrayOfAnnoncePresentation
xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"
xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"\u003e\u003cPageCount\u003e\u003cPageCount\u003e25\u003c/PageCount\u003e\u003c/PageCount\u003e\u003cAnnoncePresentation\u003e\u003cTitre\u003e3garages\u003c/Titre\u003e\u003cLienDetail\u003e/
How can I decode that?
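The output looks like a JSON object with the XML stored as a string under the key "d", so json_decode() may be all that is needed. This is a sketch under that assumption; the shortened $result below is a made-up stand-in for the real curl output, which is truncated in the question.

```php
<?php
// Hypothetical, shortened stand-in for the curl output shown above.
// Single quotes keep the \u003c and \" sequences exactly as they arrive.
$result = '{"d":"\u003c?xml version=\"1.0\"?\u003e\u003cPageCount\u003e25\u003c/PageCount\u003e"}';

// json_decode() turns \u003c back into "<" and \u003e back into ">".
$data = json_decode($result, true);
if ($data === null) {
    die('Not valid JSON: ' . json_last_error_msg());
}

$xmlString = $data['d'];
echo $xmlString;
```

The decoded string can then be handed to an XML parser such as SimpleXMLElement.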
I am trying to read a .tsv file using PHP. I am using the simplest method of file_get_contents() but it is skipping any text between <> tags.
Following is the format of my .tsv file
<id_svyx35_88c_avbfa5> <Kuldeep_Raval> rdf:type <wikicat_Delhi_Daredevils_cricketers>
Following is the code I am using
$filename = "access_s.tsv";
$content = file_get_contents($filename);
//Split file into lines
$lines = explode("\n", $content);
echo $content;
On reading it, the output is just
rdf:type
How can I read and output each line exactly as it is?
Try to apply htmlspecialchars() to $content:
$filename = "access_s.tsv";
$content = htmlspecialchars(file_get_contents($filename));
//Split file into lines
$lines = explode("\n", $content);
echo $content;
Reference on php.net
The tags have always been there, the browser just does not show them. Just like with any valid HTML tag, you can see them when viewing the source code of the website.
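To illustrate that, here is a small sketch that works on the raw data and escapes only at the moment of output. It assumes the fields are separated by real tab characters; the sample line is copied from the question.

```php
<?php
// One line from the question, assuming tab characters between the fields.
$line = "<id_svyx35_88c_avbfa5>\t<Kuldeep_Raval>\trdf:type\t<wikicat_Delhi_Daredevils_cricketers>";

// Work with the unescaped values: the angle brackets are still there.
$fields = explode("\t", $line);

// Escape only when printing into an HTML page, so the browser shows the brackets.
$escaped = htmlspecialchars($line, ENT_QUOTES, 'UTF-8');
echo $escaped;
```

Escaping at output time keeps $fields usable for further processing without having to undo the entities.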
I want to add triple spacing in my XML, but the browser changes triple spacing to a single space. I have found that &#160; can be used to include spacing. I have a tab-delimited text file and I am converting it to XML using PHP. To get triple spacing inside my title node, I am doing this:
$xml->startElement('Products');
while ($line = fgetcsv($fp, 0, "\t")) {
$xml->startElement('Product');
//replace each single space in the title with a triple space
$title = str_replace(" ", "&#160;&#160;&#160;", $line[1]);
$xml->writeElement('title', $title);
.....
$xml->endElement();
}
$xml->endElement();
}
But &#160; is changed to &amp;#160; and the title looks something like this:
<title>A&amp;#160;&amp;#160;&amp;#160;Test</title>
Basically PHP is changing & to &amp;, but I want the exact &#160; so that I can have triple spacing in the title:
<title>A&#160;&#160;&#160;Test</title>
Any help?
It does not change the spacing. But in HTML/XML several whitespaces are usually (not always) rendered as a single space or a linebreak.
&#160; is the non-breaking space. One of its uses is to separate the parts of a phone number. You don't want one of the spaces in a phone number rendered as a linebreak.
The UTF-8 bytes for this character are \xC2\xA0.
$xml->writeElement('foo', "foo\xC2\xA0&\xC2\xA0bar");
If the XML document's encoding is ASCII, they will get encoded as entities.
$xml = new XMLWriter();
$xml->openMemory();
$xml->startDocument('1.0', 'ASCII');
$xml->writeElement('foo', "foo\xC2\xA0&\xC2\xA0bar");
$xml->endDocument();
echo $xml->outputMemory(TRUE);
Output:
<?xml version="1.0" encoding="ASCII"?>
<foo>foo&#xA0;&amp;&#xA0;bar</foo>
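Applied to the title from the question, that could look like the sketch below. The title value is a made-up example, and the document is declared as UTF-8 here, so the non-breaking spaces are written as raw bytes rather than entities.

```php
<?php
$title = 'A Test';   // hypothetical title value
$nbsp  = "\xC2\xA0"; // UTF-8 bytes of the non-breaking space (U+00A0)

$xml = new XMLWriter();
$xml->openMemory();
$xml->startDocument('1.0', 'UTF-8');
// Replace every ordinary space with three non-breaking spaces.
$xml->writeElement('title', str_replace(' ', str_repeat($nbsp, 3), $title));
$xml->endDocument();

$out = $xml->outputMemory(true);
echo $out;
```

Since the characters themselves are written, XMLWriter has no & to escape, which avoids the double-encoding problem described in the question.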
I have to parse externally provided XML that has attributes with line breaks in them. Using SimpleXML, the line breaks seem to be lost. According to another stackoverflow question, line breaks should be valid (even though far less than ideal!) for XML.
Why are they lost? [edit] And how can I preserve them? [/edit]
Here is a demo file script (note that when the line breaks are not in an attribute they are preserved).
PHP File with embedded XML
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<Rows>
<data Title='Data Title' Remarks='First line of the row.
Followed by the second line.
Even a third!' />
<data Title='Full Title' Remarks='None really'>First line of the row.
Followed by the second line.
Even a third!</data>
</Rows>
XML;
$xml = new SimpleXMLElement( $xml );
print '<pre>'; print_r($xml); print '</pre>';
Output from print_r
SimpleXMLElement Object
(
[data] => Array
(
[0] => SimpleXMLElement Object
(
[#attributes] => Array
(
[Title] => Data Title
[Remarks] => First line of the row. Followed by the second line. Even a third!
)
)
[1] => First line of the row.
Followed by the second line.
Even a third!
)
)
Using SimpleXML, the line breaks seem to be lost.
Yes, that is expected... in fact it is required of any conformant XML parser that newlines in attribute values represent simple spaces. See attribute value normalisation in the XML spec.
If there was supposed to be a real newline character in the attribute value, the XML should have included a &#10; character reference instead of a raw newline.
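A quick way to see the difference is to parse the same attribute twice, once with a raw newline and once with a character reference. This is a minimal sketch with made-up XML strings, cut down from the example in the question.

```php
<?php
// Raw newline inside the attribute value: the parser must normalise it to a space.
$raw = "<Rows><data Remarks='First line.\nSecond line.' /></Rows>";

// Character reference instead of a raw newline: the parser keeps it as a newline.
$ref = "<Rows><data Remarks='First line.&#10;Second line.' /></Rows>";

$rawRemarks = (string) (new SimpleXMLElement($raw))->data['Remarks'];
$refRemarks = (string) (new SimpleXMLElement($ref))->data['Remarks'];

var_dump($rawRemarks); // newline replaced by a single space
var_dump($refRemarks); // newline preserved
```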
The entity for a new line is &#10;. I played with your code until I found something that did the trick. It's not very elegant, I warn you:
//First remove any indentations:
$xml = str_replace(" ","", $xml);
$xml = str_replace("\t","", $xml);
//Next unify all new lines into Unix LF:
$xml = str_replace("\r","\n", $xml);
$xml = str_replace("\n\n","\n", $xml);
//Next replace all new lines with the character reference:
$xml = str_replace("\n","&#10;", $xml);
Finally, replace any new line entities between >< with a new line:
$xml = str_replace(">&#10;<",">\n<", $xml);
The assumption, based on your example, is that any new lines that occur inside a node or attribute will have more text on the next line, not a < to open a new element.
This of course would fail if your next line had some text that was wrapped in a line-level element.
Assuming $xmlData is your XML string before it is sent to the parser, this should replace all newlines in attributes with the correct entity. I had the issue with XML coming from SQL Server.
$parts = explode("<", $xmlData); //split over <
array_shift($parts); //remove the blank array element
$newParts = array(); //create array for storing new parts
foreach($parts as $p)
{
list($attr,$other) = explode(">", $p, 2); //get attribute data into $attr
$attr = str_replace("\r\n", "&#10;", $attr); //do the replacement
$newParts[] = $attr.">".$other; // put parts back together
}
$xmlData = "<".implode("<", $newParts); // put parts back together prefixing with <
Probably can be done more simply with a regex, but that's not a strong point for me.
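For what it's worth, a regex version could look something like the sketch below. It assumes that no attribute value contains a literal < or >, and that the document has no comments or CDATA sections containing tags; the sample $xmlData is made up.

```php
<?php
// Hypothetical sample input with Windows line endings inside an attribute.
$xmlData = "<Rows>\r\n<data Title='T' Remarks='First line.\r\nSecond line.' />\r\n</Rows>";

// Match each tag (from "<" to the next ">") and replace line breaks inside it only.
// Assumption: attribute values never contain "<" or ">".
$fixed = preg_replace_callback(
    '/<[^<>]+>/',
    function ($m) {
        return preg_replace('/\r\n|\r|\n/', '&#10;', $m[0]);
    },
    $xmlData
);

$remarks = (string) (new SimpleXMLElement($fixed))->data['Remarks'];
var_dump($remarks);
```

Line breaks between elements are left alone, since they fall outside the matched tags.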
Here is code to replace the new lines with the appropriate character reference in that particular XML fragment. Run this code prior to parsing.
$replaceFunction = function ($matches) {
return str_replace("\n", "&#10;", $matches[0]);
};
$xml = preg_replace_callback(
"/<data Title='[^']+' Remarks='[^']+'/i",
$replaceFunction, $xml);
This is what worked for me:
First, get the xml as a string:
$xml = file_get_contents($urlXml);
Then do the replacement:
$xml = str_replace(".\xe2\x80\xa9<as:eol/>",".\n\n<as:eol/>",$xml);
The "." and "< as:eol/ >" were there because I needed to add breaks in that case. The new lines "\n" can be replaced with whatever you like.
After replacing, just load the xml-string as a SimpleXMLElement object:
$xmlo = new SimpleXMLElement( $xml );
Et Voilà
Well, this question is old, but someone might come to this page eventually, like I did.
I took a slightly different approach, which I think is the most elegant of those mentioned.
Inside the xml, you put some unique word which you will use for new line.
Change xml to
<data Title='Data Title' Remarks='First line of the row. \n
Followed by the second line. \n
Even a third!' />
Then, once you have read the desired node from SimpleXML into a string ($output here), do something like this:
$findme = '\n'; // single quotes: a literal backslash followed by n, not a real newline
if (strpos($output, $findme) !== false)
{
$output = str_replace($findme, "<br/>", $output);
}
It doesn't have to be '\n'; it can be any unique marker.