I am picking up some legacy code from another developer. It's a middleman script that passes XML data between systems via curl. All of a sudden, the XML that it's returning contains strange characters between everything, rendering it invalid XML:
If I bypass the PHP script in question and post the data directly to the other system, it returns valid XML, so it seems to be a problem with the PHP script.
Below is the curl code:
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $xml->asXML());
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER,
array(
'Content-Type: text/plain',
'Authorization: Basic ' . $token)
);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
Echoing $result shows the invalid XML.
Again, if I post directly to $url via Postman, I get the correct and valid XML.
I tried changing the Content-Type to application/xml, but that didn't help.
Is there an encoding issue that has perhaps been introduced by a server or PHP update?
Thank you
What you see is a raw UTF-16 stream. Your application is most likely using UTF-8.
I'm sure the server will not change the output encoding per your request and I'm not aware of any means to handle it right from Curl, so you'll need to do the conversion yourself:
var_dump(mb_convert_encoding($result, 'UTF-8', 'UTF-16'));
However, it's possible that your XML processor will handle it nicely out of the box. SimpleXML certainly does.
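If you do need the conversion, a slightly more defensive version checks for a byte order mark first, so that a response which is already UTF-8 is left alone. This is only a sketch: `toUtf8` is a hypothetical helper, the response is simulated rather than fetched, and it assumes the mbstring extension is loaded. If the XML declaration in the response names UTF-16, a strict parser may also want that declaration adjusted after conversion.

```php
<?php
// Hypothetical helper: convert a response body to UTF-8 if it starts
// with a UTF-16 byte order mark (BOM); otherwise return it unchanged.
function toUtf8(string $body): string
{
    if (strncmp($body, "\xFF\xFE", 2) === 0) {
        // Little-endian BOM: strip it and convert the rest.
        return mb_convert_encoding(substr($body, 2), 'UTF-8', 'UTF-16LE');
    }
    if (strncmp($body, "\xFE\xFF", 2) === 0) {
        // Big-endian BOM: strip it and convert the rest.
        return mb_convert_encoding(substr($body, 2), 'UTF-8', 'UTF-16BE');
    }
    return $body; // no UTF-16 BOM found, assume it is already UTF-8
}

// Simulate what curl_exec() might return from a server that answers in UTF-16LE.
$original = '<root><item>ok</item></root>';
$result = "\xFF\xFE" . mb_convert_encoding($original, 'UTF-16LE', 'UTF-8');

$converted = toUtf8($result);
$untouched = toUtf8($original);
```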
if I set the encoding to UTF-16 via curl_setopt($ch, CURLOPT_ENCODING, 'UTF-16'); it still has those characters
That sets the Accept-Encoding request header and also instructs Curl to decode the response. It is meant to handle compressed data and is unrelated to text encoding (plus you're requesting UTF-16, precisely the encoding you don't accept).
Related
simple_html_dom does not take data from some websites.
For the website www.google.pl, it downloads the source of the page,
but for other such as: gearbest.com, stooq.pl does not download any data.
require('simple_html_dom.php');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.google.com/"); // work
/*
curl_setopt($ch, CURLOPT_URL, "https://www.gearbest.com/"); // dont work
curl_setopt($ch, CURLOPT_URL, "https://stooq.pl/"); // dont work
*/
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
curl_close($ch);
$html = new simple_html_dom();
$html->load($response);
echo $html;
What should I change in the code to receive data from websites?
The root problem here (at least on my computer; it may differ with
your version) is that the site returns gzipped data, which PHP and
curl do not uncompress before it is passed to the DOM parser. If you
are using PHP 5.4 or later, you can fetch the page with
file_get_contents and uncompress it yourself with gzdecode; the
example below uses gzinflate instead, which also works on older versions.
<?php
// download the site
$data = file_get_contents("http://www.tsetmc.com/loader.aspx?ParTree=151311&i=49776615757150035");
// decompress it (a bit hacky to strip off the gzip header)
$data = gzinflate(substr($data, 10, -8));
include("simple_html_dom.php");
// parse and use
$html = str_get_html($data);
echo $html->root->innertext();
Note that this hack will not work on most sites. The underlying
reason, it seems to me, is that curl doesn't announce that it accepts
gzip data, but the web server on that domain ignores the missing
header and gzips the response anyway. Then neither curl nor PHP
checks the Content-Encoding header on the response; they assume the
body isn't gzipped and pass it through without raising an error or
calling gunzip. Bugs in both the server and the client here!
For a more robust solution, maybe you can use curl to get the headers
and inspect them yourself to determine if you need to decompress it.
Or you can just use this hack for this site and the normal method for
others to keep things simple.
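The "decompress only when needed" idea can be sketched without touching the network. Here `maybeGunzip` is a hypothetical helper, and it looks at the gzip magic bytes at the start of the body rather than at the Content-Encoding header, which is a simplification. It assumes the zlib extension is available. If you stay with curl, setting CURLOPT_ENCODING to an empty string makes curl advertise every encoding it supports and decompress the response for you, which may make a helper like this unnecessary.

```php
<?php
// Hypothetical helper: gunzip the body only if it looks gzipped.
function maybeGunzip(string $data): string
{
    // Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    if (strncmp($data, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($data);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $data; // not gzipped (or not decodable): pass it through
}

// Simulate one server that gzips its response and one that does not.
$page = '<html><body>hello</body></html>';
$compressed = gzencode($page);

$fromGzipServer = maybeGunzip($compressed);
$fromPlainServer = maybeGunzip($page);
```

The result of `maybeGunzip` is what you would then hand to `str_get_html` or `$html->load()`.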
It might still also help to set the character encoding on your output.
Add this before you echo anything to ensure the data you read isn't
recorrupted in the user's browser by being read as the wrong charset:
header('Content-Type: text/html; charset=utf-8');
I'm new to XML, and usually use JSON to pass data. I am working with a new system, and this was part of their instructions to me about passing data:
The XML content then can be sent as either PAYLOAD on the stream or as an additional parameter. If the latter is done, the parameter name is RequestXML
I'm not sure what this means. I'm afraid that if I pass it as a parameter and have a lot of text, the URL will become too long, so I'd like to use the PAYLOAD option. I'm using PHP and jQuery to generate the array. I can create an XML file using PHP and have it properly formatted as XML, but sending it across is confusing me.
What do I need to do to get it sent as a PAYLOAD?
You'll likely just want to send an HTTP POST request with the XML as the request body. Here's an example using the curl library:
<?php
$url = "https://example.com/service";
$xml = "<foo />";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/xml'));
curl_setopt($ch, CURLOPT_POSTFIELDS, $xml);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
?>
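For comparison, the "additional parameter" option from the instructions does not have to lengthen the URL, because a parameter can travel in the POST body too. The sketch below only builds that body. The parameter name RequestXML comes from the quoted instructions, but whether the service accepts it in the body rather than the query string is an assumption to check with its documentation.

```php
<?php
$xml = '<foo><bar>some text & more</bar></foo>';

// Build a form-encoded body with the XML as the value of RequestXML.
// http_build_query() takes care of URL-encoding the XML.
$body = http_build_query(array('RequestXML' => $xml));

// $body is what you would pass to CURLOPT_POSTFIELDS in place of the raw XML,
// leaving the Content-Type at cURL's default for string bodies
// (application/x-www-form-urlencoded).

// Decode it again, as the receiving side would, to check nothing was lost.
parse_str($body, $decoded);
```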
I have been given an example piece of code from a company I'm dealing with for how to post XML data to a URL and then read the response. Unfortunately for me, this is in VBS, which I don't have a good working knowledge of:
This is the section of code that I'm interested in. This should pass over the XML file that was read in to oXML then post it and read the response:
set oHTTP = CreateObject("Microsoft.XMLHTTP")
oHTTP.open "POST", "http://www.ophub.net/opxml/response.asp", false,00092,QW 'file url - with dealers Account number, Password
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
oHTTP.setRequestHeader "Content-Length", Len(sRequest)
oHTTP.send oXML
From what I understand, this can be done in PHP with cURL. I have come up with the following from bits that I have read online, but it doesn't work and I'm not sure why.
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, Array("Content-Type: application/x-www-form- urlencoded"));
curl_setopt($ch, CURLOPT_URL, "http://www.ophub.net/opxml/response.asp");
curl_setopt($ch, CURLOPT_USERPWD, "00092:QW");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "XML=" . $xml);
$content=curl_exec($ch);
echo $content;
I'm sure I can't be far off what I need, but I can't seem to get there, so any help would be very much appreciated.
In your POST fields, just set the XML itself; there is no need to prefix it with XML=
curl_setopt($ch, CURLOPT_POSTFIELDS, $xml);
You also need to set CURLOPT_RETURNTRANSFER to true:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
It looks to me like that is precisely what you need. It seems to be an exact translation.
The only two potential issues that I can see are:
Your PHP code seems to have a space in the middle of the Content-Type header value; this will most likely break things.
You will, if you haven't already done so, need to URL encode the $xml data before sending it as part of an application/x-www-form-urlencoded message (clue's in the type name :-P), which you can do with urlencode().
It might be a good idea to build the body data as a string and echo it out, to ensure that the data is correct:
echo 'XML=' . urlencode($xml);
For the record wrapping XML messages in application/x-www-form-urlencoded is a horrible way to do things. But I've come across more than one API that does it, so I'm going to assume that your code is correct in this regard for the API you are trying to consume.
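Putting those fixes together, a corrected request might look like the sketch below. The helper names `buildFormBody` and `postXml` are hypothetical, and the field name XML is simply carried over from the question, so treat it as an assumption about what the API expects. `postXml` is defined but not called, since calling it needs network access and real credentials.

```php
<?php
// Build the form-encoded body; urlencode() escapes characters such as
// "<", "&" and spaces that would otherwise corrupt the message.
function buildFormBody(string $xml): string
{
    return 'XML=' . urlencode($xml);
}

// Send the request and return the response body (or false on failure).
function postXml(string $url, string $userPwd, string $xml)
{
    $ch = curl_init($url);
    // No space inside the header value this time.
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/x-www-form-urlencoded'));
    curl_setopt($ch, CURLOPT_USERPWD, $userPwd);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, buildFormBody($xml));
    // Return the response instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $content = curl_exec($ch);
    curl_close($ch);
    return $content;
}

// Inspect the body before sending anything, as suggested above.
$sample = '<order id="1">A & B</order>';
$body = buildFormBody($sample);
echo $body, "\n";
```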
I've been trying to perform an XML request. I've faced so many problems that I managed to solve. But this one I couldn't solve.
this is the script:
$url ="WebServiceUrl";
$xml="XmlRequest";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_MUTE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/xml'));
curl_setopt($ch, CURLOPT_POSTFIELDS, "$xml");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
echo $output;
It is giving me this error:
System.InvalidOperationException: Request format is invalid: text/xml. at System.Web.Services.Protocols.HttpServerProtocol.ReadParameters() at System.Web.Services.Protocols.WebServiceHandler.CoreProcessRequest()
I'm still a noob at this. So go easy on me:)
thanks.
Looks like you're sending stuff as text/xml, which is not what it wants. Find the docs for this web service e.g. WSDL stuff if it's there, and find out what data formats it accepts.
Check, for example, that the documentation isn't really saying that it responds in XML after receiving a request made with standard HTML form POST variables.
There are two main content types used with the HTTP POST method: application/x-www-form-urlencoded and multipart/form-data.
The content type determines what the format of CURLOPT_POSTFIELDS should be. If you are using the default, which is "application/x-www-form-urlencoded", you probably want to use http_build_query() to construct the URL-encoded query string.
If you are sending non-ASCII data, you can pass an associative array with keys that match the field names and values that correspond to the value for each field. Using this technique will cause the request to be issued with a multipart/form-data content type.
At this point, it sounds like your next step should be figuring out what fields the API is expecting.
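The difference between the two forms can be illustrated without sending anything. In this sketch the field names are made up, and `defaultContentType` is a hypothetical helper that only restates the rule described above: a string passed to CURLOPT_POSTFIELDS goes out URL-encoded, while an array goes out as multipart/form-data.

```php
<?php
// Hypothetical helper: reports which Content-Type PHP's cURL extension
// uses by default for a given CURLOPT_POSTFIELDS value.
function defaultContentType($postFields): string
{
    return is_array($postFields)
        ? 'multipart/form-data'
        : 'application/x-www-form-urlencoded';
}

// Made-up field names; replace them with whatever the API documents.
$fields = array(
    'username' => 'demo',
    'request'  => '<query>a & b</query>',
);

// String form: build the URL-encoded body yourself.
$asString = http_build_query($fields);

$stringType = defaultContentType($asString); // what cURL would use for the string
$arrayType = defaultContentType($fields);    // what cURL would use for the array
```

Either `$asString` or `$fields` is what you would pass as the value of CURLOPT_POSTFIELDS, depending on which format the service expects.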
I want to use the SimpleXML class in PHP 5 to handle a small XML file. But to obtain that file, the script has to send a specific POST request to a remote server that will "give" me an XML file in return. So I believe I can't use the "simplexml_load_file" method. This file is needed just for processing; after that it can, or even should, be deleted.
I've got HTTP HEADER of this type
$header = 'POST '.$gateway.' HTTP/1.0'."\r\n" .
'Host: '.$server."\r\n".
'Content-Type: application/x-www-form-urlencoded'."\r\n".
'Content-Length: '.strlen($param)."\r\n".
'Connection: close'."\r\n\r\n";
And not much idea of what to do next with that. There is fsockopen but I'm not sure if that would be appropriate or how to go with it.
My advice would be to use something like the Zend_Http_Client library or cURL. Getting everything right with fsockopen will be a pain to debug.
Zend_Http_Client has a nice interface and would work fabulously.
cURL isn't too much of a pain either and is already a part of most PHP builds.
Example below:
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/"); // Replace with your URL
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $param); // the URL-encoded body from your question
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch); // Returns the XML as a string, or false on failure
curl_close($ch);
// Parse the output with SimpleXML
// You'll probably want to validate that the returned output really is XML
$xml = simplexml_load_string($output);
I'd use an HTTP client library like Zend_Http_Client (or cURL if you're a masochist) to create the POST request, then feed the response body into simplexml_load_string or SimpleXMLElement::__construct()
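Whichever client you use, the parsing step can be checked on its own. The sketch below feeds literal strings to a hypothetical `parseXmlResponse` helper in place of a real response body, and uses libxml's internal error handling so that malformed input yields null instead of warnings. Nothing is written to disk, so there is no file to delete afterwards.

```php
<?php
// Hypothetical helper: parse a response body, returning null if it is
// empty, not a string (e.g. false from curl_exec), or not well-formed XML.
function parseXmlResponse($body): ?SimpleXMLElement
{
    if (!is_string($body) || $body === '') {
        return null;
    }
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

// Stand-ins for what the remote server might return.
$good = parseXmlResponse('<items><item id="1">first</item></items>');
$bad = parseXmlResponse('<items><item>');
$failed = parseXmlResponse(false);

if ($good !== null) {
    echo (string) $good->item, "\n";       // element text
    echo (string) $good->item['id'], "\n"; // attribute value
}
```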