PHP header for xml with utf-8 - php

I'm using the header() function to serve a file as standard XML.
The problem is that when I use <?php header("Content-type: text/xml; charset=utf-8"); ?>, the output still contains just <?xml version="1.0"?>, without the encoding/charset. Am I using it wrongly?

The header() function just modifies HTTP headers. The code you posted sets a Content-Type header, which is important for telling browsers and other clients what type of file you're serving.
The <?xml version="1.0"?> line you're talking about is part of the document itself, and not affected by the HTTP headers.
Your tags say you're using DOM to create your XML document. If you change your DOMDocument constructor to also pass the charset,
$doc = new DOMDocument('1.0', 'UTF-8');
it should output that charset in the XML document.
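As a rough sketch of how the two pieces fit together: the HTTP header and the XML declaration are set separately, and the declaration comes from the DOMDocument constructor. The function name and the element names "items" and "item" are made up for illustration, and the code assumes PHP 7 or later with the DOM extension.

```php
<?php
// Sketch: build a small document and return the serialized XML.
// buildXmlWithEncoding(), "items" and "item" are illustrative names only.
function buildXmlWithEncoding(): string
{
    // The second constructor argument ends up in the XML declaration.
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->createElement('items');
    $doc->appendChild($root);
    $root->appendChild($doc->createElement('item', 'first'));

    return $doc->saveXML();
}

// The HTTP header is independent of the declaration inside the document.
// Only send it when running under a web server and nothing was output yet.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/xml; charset=utf-8');
    echo buildXmlWithEncoding();
}
```

The returned string should begin with a declaration that carries encoding="UTF-8", regardless of whether the header was sent.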

header() just sets an HTTP header in the response. PHP doesn't do anything else with the value, so it's up to you to make sure it's being used properly.
If you're using an XML library to generate your XML (including the prologue), check the documentation for that library. If you're outputting the XML "by hand", you need to add the necessary attribute to the prologue yourself.

You can also use
<?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
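If you do write the XML by hand, the text values also need escaping, or the document can become invalid. A minimal sketch, assuming PHP 7 or later; renderItemsXml() and the element names are hypothetical and only serve as an illustration:

```php
<?php
// Hypothetical helper: writes the prologue and the elements by hand.
function renderItemsXml(array $items): string
{
    // The encoding attribute has to be written explicitly here.
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n<items>\n";

    foreach ($items as $item) {
        // ENT_XML1 escapes &, <, > and quotes using XML entities.
        $escaped = htmlspecialchars((string) $item, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <item>' . $escaped . "</item>\n";
    }

    return $xml . "</items>\n";
}
```

Echoing renderItemsXml(['a & b']) after the Content-type header would then produce a document whose prologue names UTF-8 and whose ampersand is escaped.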

Related

Load, Read XML file but save become file text with spacing

I have just loaded the XML file and saved it immediately, but then it becomes a text file.
My XML structure is simple:
<?xml version="1.0" encoding="UTF-8"?>
<ContentInfo>
<Content>
<PlayerID>P1</PlayerID>
<TVID>TV1</TVID>
<TVStatus>0</TVStatus>
</Content>
</ContentInfo>
Here is my php code:
<script type="text/javascript" src="jquery-2.1.4.js"></script>
<?php
ob_start();
$xml = new DOMDocument('1.0');
$xml->encoding = 'UTF-8';
$xml->formatOutput = true;
$xml->preserveWhiteSpace = false;
$xml->load('TV_Status.xml');
htmlentities($xml->save('TV_Status.xml'));
header("Refresh: 120;url='index.php'");
?>
It automatically changes to a text file, although the name still ends in .xml and the file can still be edited.
XML files are text files. There should be no problem: https://en.wikipedia.org/wiki/XML
I just tried it and the new written file looks almost exactly the same as the source file.
XML is text.
You should remove ob_start();. If you mean that the format is not like you want it to be then you should play around with this part
$xml->formatOutput = true;
$xml->preserveWhiteSpace = false;
a bit.
XML files are text files: text files with a specific syntax that an application can parse. By default, if you send data to the browser, PHP sends an additional piece of information (an HTTP header) saying that the data is HTML. If a browser receives data as HTML, it will try to parse and render it. XML uses a comparable syntax, so the XML tags get parsed as HTML but are ignored as unknown HTML tags. The browser displays only the text content of the XML file.
You can change that behavior by sending different content type headers:
Send as XML:
header("Content-Type: application/xml");
readfile('TV_Status.xml');
Send as text:
header("Content-Type: text/plain");
readfile('TV_Status.xml');
You don't need to load and parse an XML file for this, you can just read and pass it to the browser using readfile().
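The two variants above could be wrapped in one small function. This is only a sketch, assuming PHP 7 or later; serveXmlFile() and its $asPlainText parameter are hypothetical names, not part of any library:

```php
<?php
// Hypothetical helper: sends a file with an XML or plain-text content type.
function serveXmlFile(string $path, bool $asPlainText = false): bool
{
    if (!is_readable($path)) {
        return false;
    }

    // Headers can only be set before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: ' . ($asPlainText ? 'text/plain' : 'application/xml'));
    }

    // readfile() streams the file to the output without parsing it.
    return readfile($path) !== false;
}
```

Calling serveXmlFile('TV_Status.xml') would let the browser treat the response as XML, while serveXmlFile('TV_Status.xml', true) would show the raw text.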

How to get the html source of a page correct?

I use this code to get an HTML source:
<?php
header('Content-Type: text/html; charset=utf-8');
$html = file_get_html("http://www.google.com/");
echo $html;
But when I try to get the source from here I don't get a correct response; instead I get characters like these:
���moY�&�9����i�[S$%ٲ�9������l�l/���F"H�H�VDPJ����˲59��[��v���R�Vɖ3KY��_A����_� ��so�1�N��T�E"#nܸ��s��=� ��������?�?������� ���|������0Vk��Z�2o��E۪ ү�XF�ny���;v�R�ܦ���F�Ƨe˷ ��g����{�������}
The content from Google by default uses some sort of HTTP compression. Two commonly used compression schemas are gzip and deflate. Read more about it here:
http://en.m.wikipedia.org/wiki/HTTP_compression
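If the body you receive is gzip-compressed, one possible approach is to detect the gzip signature and decompress it before use. The sketch below assumes PHP 7 or later with the zlib extension; decodeIfGzipped() is a hypothetical helper name, and it does not handle raw deflate responses:

```php
<?php
// Hypothetical helper: returns the decompressed body if it looks gzipped.
function decodeIfGzipped(string $body): string
{
    // Gzip streams start with the two magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body;
    }

    // gzdecode() returns false when the data cannot be decompressed.
    $decoded = @gzdecode($body);

    return $decoded === false ? $body : $decoded;
}
```

Passing the downloaded string through decodeIfGzipped() before echoing it should turn the unreadable bytes back into HTML when gzip was the cause.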

SimpleXML addChild issue when header sent

I have this code :
$xml = new SimpleXMLElement('<myxml></myxml>');
$xml->addChild('testNode attr="test Attribute"');
$node = $xml->addChild('erroNode attr="My Child node causes error -> expect >"');
//$node->addChild('nodeChild attr="node Child"');
header('Content-type: text/xml');
echo $xml->asXML();
exit();
I can create a child node with attributes via $xml, but not with $node (the child's child). Why? I get the error: error on line 2 at column 66: expected '>'
The docs say that the addChild function returns a SimpleXMLElement of the child.
Check by uncommenting the commented line $node->addChild('nodeChild attr="node Child"');
Also, it only happens when the header is sent. If I comment out the header as below, I can see the correct XML in the page source:
$xml = new SimpleXMLElement('<myxml></myxml>');
$xml->addChild('testNode attr="test Attribute"');
$node = $xml->addChild('erroNode attr="My Child node causes error -> expect >"');
$node->addChild('nodeChild attr="node Child"');
//header('Content-type: text/xml');
echo $xml->asXML();
exit();
My PHP version is 5.4.9
The error you are seeing is not coming from SimpleXML, but from your browser - that's why changing the HTTP header works. With this line, the browser knows the page is XML, and checks that it's valid; without it, it assumes it's HTML, and is more lenient:
header('Content-type: text/xml');
If you use "View Source" in your browser, you'll find that the actual output from PHP is the same in both cases. Another nice test is to set the content-type to text/plain instead, which means the browser won't interpret the output at all, just show it as-is.
So, for some reason, SimpleXML is generating invalid XML. This is because the ->addChild() method takes as its first argument just the name of the element to add, in your case 'erroNode'; you are passing in an invalid name that also includes attributes, which should be added later with ->addAttribute().
If we simplify the example a bit further, and look at the XML generated, we can see what's going on (here's an online demo):
// Make browser show plain output
header('Content-type: text/plain');
// Working example
$xml = new SimpleXMLElement('<myxml></myxml>');
$xml->addChild('testNode attr="test Attribute"');
echo $xml->asXML();
echo "\n";
// Broken example
$xml = new SimpleXMLElement('<myxml></myxml>');
$node = $xml->addChild('testNode attr="test Attribute"');
$node->addChild('test');
echo $xml->asXML();
This outputs the below:
<?xml version="1.0"?>
<myxml><testNode attr="test Attribute"/></myxml>
<?xml version="1.0"?>
<myxml><testNode attr="test Attribute"><test/></testNode attr="test Attribute"></myxml>
The first version of the XML appears to be doing the right thing, because it has created a "self-closing tag". However, in the second, you can see that SimpleXML thinks that the tag name is 'testNode attr="test Attribute"', not just 'testNode', because that's what we told it.
The result is that it tries to put a closing tag with that "name", and ends up with </testNode attr="test Attribute">, which isn't valid XML.
Arguably, SimpleXML should protect you against this kind of thing, but now that you know, you can easily fix the code (demo):
// Make browser show plain output
header('Content-type: text/plain');
// Fixed example
$xml = new SimpleXMLElement('<myxml></myxml>');
$node = $xml->addChild('testNode');
$node->addAttribute('attr', 'test Attribute');
$node->addChild('test');
echo $xml->asXML();
Now, SimpleXML knows that the tag is just called 'testNode', so can create the correct closing tag when it needs to:
<?xml version="1.0"?>
<myxml><testNode attr="test Attribute"><test/></testNode></myxml>
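If you add many elements with attributes, the two calls can be wrapped in a small helper. This is just a sketch, assuming PHP 7 or later (the scalar type declarations would not work on 5.4); addChildWithAttributes() is a made-up name, not part of SimpleXML:

```php
<?php
// Hypothetical helper: adds a child by name, then sets its attributes.
function addChildWithAttributes(SimpleXMLElement $parent, string $name, array $attributes = []): SimpleXMLElement
{
    // Only the element name goes to addChild().
    $child = $parent->addChild($name);

    foreach ($attributes as $attrName => $attrValue) {
        $child->addAttribute((string) $attrName, (string) $attrValue);
    }

    return $child;
}

$xml = new SimpleXMLElement('<myxml></myxml>');
$node = addChildWithAttributes($xml, 'testNode', ['attr' => 'test Attribute']);
$node->addChild('test');
$result = $xml->asXML();
```

Here $result holds the same document as the fixed example above, with a properly named closing tag.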

Does an RSS feed have to be an XML file?

I ask this because I see only two XML files in a WordPress blog, wlwmanifest.xml and default.xml, and neither looks like an RSS feed. However, I do see a PHP file called feed-rss2.php that looks like an RSS feed. Everything I've ever read says that RSS feeds have to be XML files. Am I wrong? Can they be PHP files with XML code inside?
They are just outputting XML code with XML headers. The actual file doesn't have to be an XML file, just the response has to be text/xml and contain XML output. You can do the same for things like CSS files... anything really.
There is no such thing as a file extension in HTTP.
A client requests a URI from a server. The server responds with a Content-Type HTTP header that says what type of file it is sending back, and the file in the HTTP body.
The client doesn't care (and can't know) if the server generated that response by reading a static file, by running a program, or by some other means.
There is no difference to the client between a PHP program that outputs XML and a static XML file.
RSS readers only see what the server sends to the client, not what runs on the server. In WordPress installations, feed-rss2.php is a PHP file processed by the server, which sends the correct header in this format:
header("Content-type: text/xml");
So the readers know that it is an XML response and not a PHP file. As Robbo said, the actual file doesn't have to be an XML file; just the response has to be text/xml and contain XML output.
The same applies to styles. If you look at WordPress's style.php, it would have something like:
header("Content-type: text/css");
include($theme . "/style.css");
This way it uses PHP's power to read the appropriate file and serve the output at the same URL. Easy, isn't it?
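To make the idea concrete, here is a minimal sketch of a PHP script producing an RSS 2.0 document. It is not how WordPress implements feed-rss2.php; the function names and the sample data shape are assumptions for illustration, and the code assumes PHP 7 or later with the DOM extension:

```php
<?php
// Hypothetical helper: appends <name>text</name>, escaping the text safely.
function appendTextElement(DOMDocument $doc, DOMNode $parent, string $name, string $text): DOMElement
{
    $el = $doc->createElement($name);
    $el->appendChild($doc->createTextNode($text));
    $parent->appendChild($el);

    return $el;
}

// Hypothetical builder: each item is an array with 'title' and 'link' keys.
function buildRssFeed(string $title, string $link, string $description, array $items): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $rss = $doc->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $doc->appendChild($rss);

    $channel = $doc->createElement('channel');
    $rss->appendChild($channel);
    appendTextElement($doc, $channel, 'title', $title);
    appendTextElement($doc, $channel, 'link', $link);
    appendTextElement($doc, $channel, 'description', $description);

    foreach ($items as $item) {
        $node = $doc->createElement('item');
        $channel->appendChild($node);
        appendTextElement($doc, $node, 'title', $item['title']);
        appendTextElement($doc, $node, 'link', $item['link']);
    }

    return $doc->saveXML();
}
```

A script would call header("Content-type: text/xml") and then echo the result of buildRssFeed(); to the feed reader, that response is indistinguishable from a static XML file.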

XML generation error

I'm trying to generate an XML output with Zend_Framework, but this nasty thing keeps popping up:
XML Parsing Error: XML or text declaration not at start of entity
Location: http://cart/index/kurpirkt
Line Number 2, Column 1:<?xml version="1.0" encoding="utf-8"?>
^
As far as I know there are no white-spaces in any of my include files, and even if there were, I think that the ob_clean() function should have taken care of it. Here is my code:
public function kurpirktAction()
{
ob_clean();
// XML-related routine
$xml = new DOMDocument('1.0', 'utf-8');
$xml->appendChild($xml->createElement('foo', 'bar'));
$output = $xml->saveXML();
// Both layout and view renderer should be disabled
Zend_Controller_Action_HelperBroker::getStaticHelper('viewRenderer')->setNoRender(true);
Zend_Layout::getMvcInstance()->disableLayout();
// Setting up headers and body
$this->_response->setHeader('Content-Type', 'text/xml; charset=utf-8')
->setBody($output);
}
Any help or suggestions?
First, test whether the additional whitespace occurs in all actions of your application.
If so, check
/public/index.php and
/application/bootstrap.php
for trailing spaces before <?php or old left-over debug statements.
Edit: transferred the helpful information from the comments to the answer
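A quick way to check a suspect file is to look at what surrounds its PHP tags. The function below is only a heuristic sketch, assuming PHP 7 or later; describeStrayOutput() is a hypothetical name, and it only inspects the very start of the file and the text after the last closing tag:

```php
<?php
// Hypothetical helper: reports output that a PHP source would emit by accident.
function describeStrayOutput(string $source): ?string
{
    // A UTF-8 byte order mark is sent to the client like any other output.
    if (strncmp($source, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 byte order mark before the opening tag';
    }

    if (strncmp($source, '<?php', 5) !== 0) {
        return 'content before the opening tag';
    }

    // PHP swallows a single newline directly after the closing tag,
    // so only additional whitespace is reported.
    $pos = strrpos($source, '?' . '>');
    if ($pos !== false) {
        $tail = (string) substr($source, $pos + 2);
        $harmless = ['', "\n", "\r\n"];
        if (!in_array($tail, $harmless, true) && trim($tail) === '') {
            return 'whitespace after the closing tag';
        }
    }

    return null;
}
```

Running describeStrayOutput(file_get_contents($path)) over index.php, the bootstrap file, and the included files should narrow down which one emits the stray line before the XML declaration.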
