I'm having trouble displaying and formatting JSON.
I'm scraping an Aliexpress product page with Goutte in Laravel, and I have extracted the JSON that Aliexpress embeds in its page source (it starts with window.runParams near the end of the source).
The extraction itself works, but the result is not valid JSON. Since Aliexpress already provides the data as JSON, I assumed I don't need json_encode or json_decode. I return the response to Postman, and I don't think Postman is corrupting it; when I paste the output into a JSON formatter, it reports "Invalid character" errors on multiple lines.
This is my full code. I just instantiate this class and return the result from a controller:
use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;
class Scrapers
{
// This will be sent from another class, for now we are calling it from here
private $client;
public $url;
public $array = [];
public function __construct($url){
$this->url = $url;
$this->client = new Client(HttpClient::create(['timeout' => 60]));
$this->client->request('GET', $url)->filter('script')->each(
function ($node) {
array_push($this->array, $node->html());
});
}
public function getBetween($content,$start,$end){
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
/**
* Return product title
*/
public function getTitle(){
$content = $this->array[16];
$start = "data:";
$end = "csrfToken:";
$output = $this->getBetween($content,$start,$end);
$remove_new_lines = str_replace(array("\n", "\r"), '', $output);
return $remove_new_lines;
}
}
I have not yet stripped the trailing ',' from the extracted text, but even when I delete it by hand the formatter still does not accept the result as valid JSON.
This is how the data is returned in Postman: https://justpaste.it/3qrtm
I have tried several functions that remove tabs, newlines, and extra spaces, along with everything else I could find online, without success.
Any ideas how to fix this?
Try $node->text() instead of $node->html().
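Whichever accessor is used, it may help to let PHP's own decoder say what is wrong with the extracted text instead of relying on an online formatter. The following is a minimal sketch, not the asker's code: extractJsonBetween() is a hypothetical helper, and the sample script body is made up to resemble the structure described in the question. It trims whitespace and the trailing comma, then decodes and reports the decoder's error message on failure.

```php
<?php
// Hypothetical helper: take the text between two markers and decode it as JSON.
function extractJsonBetween(string $content, string $start, string $end): array
{
    $parts = explode($start, $content, 2);
    if (!isset($parts[1])) {
        throw new RuntimeException('Start marker not found');
    }
    $raw = explode($end, $parts[1], 2)[0];

    // Drop surrounding whitespace and the trailing comma left before the end marker.
    $raw = rtrim(trim($raw), ',');

    $decoded = json_decode($raw, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        // The message names the kind of problem (syntax error, bad UTF-8, ...).
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }
    return $decoded;
}

// Made-up script body shaped like the one described in the question.
$script = "window.runParams = {\n  data: {\"title\": \"Sample product\", \"price\": 9.99},\n  csrfToken: 'abc'\n};";

$data = extractJsonBetween($script, 'data:', 'csrfToken:');
echo json_encode($data, JSON_PRETTY_PRINT), PHP_EOL;
```

Returning the decoded array from the controller (for example through Laravel's response()->json($data)) lets the framework produce the JSON, so the output no longer depends on how the raw text was sliced.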
I'm using CakePHP with the JSON parse extension and the RequestHandler component to create web services that return JSON.
I created a controller named Ws.
In this controller I have an action named userSubscribe.
To avoid a lot of if/else statements in the methods that follow, I thought about using a private function inside this controller that checks some conditions and, when one fails, stops the script normally while still rendering the JSON normally. I just want to keep the code DRY.
My question is:
How can I render the JSON view from a sub-function (called by userSubscribe)?
To make it clear, here is the style of code I would like:
public function userSubscribe() {
$this->check();
// Following code only executed if check didn't render the json view
// $data = ...
$code = 1;
$i = 2;
}
private function check() {
$input = &$this->request->data;
if ($_SERVER["CONTENT_TYPE"] != "application/json") { // For example
$result = "KO";
$this->set(compact("result"));
$this->set('_serialize', 'result');
$this->render(); // HERE, it will stop the 'normal behaviour' and render the json with _serialize
}
if (!isset($input["input"])) {
$result = "KO";
$this->set(compact("result"));
$this->set('_serialize', 'result');
$this->render(); // HERE, it will stop the 'normal behaviour' and render the json with _serialize
}
}
It seems quite simple to do, so why can't I find the answer?
Thanks in advance for any clue or advice!
I am writing unit tests for several methods which return HTTP response codes. I cannot find a way to assert an HTTP response code. Perhaps I am missing something obvious, or I am misunderstanding something about PHPUnit.
I am using PHPUnit 4.5 stable.
Relevant part of class Message:
public function validate() {
// Decode JSON to array.
if (!$json = json_decode($this->read(), TRUE)) {
return http_response_code(415);
}
return $json;
}
// Abstracted file_get_contents a bit to facilitate unit testing.
public $_file_input = 'php://input';
public function read() {
return file_get_contents($this->_file_input);
}
Unit test:
// Load invalid JSON file and verify that validate() fails.
public function testValidateWhenInvalid() {
$stub1 = $this->getMockForAbstractClass('Message');
$path = __DIR__ . '/testDataMalformed.json';
$stub1->_file_input = $path;
$result = $stub1->validate();
// At this point, we have decoded the JSON file inside validate() and have expected it to fail.
// Validate that the return value from HTTP 415.
$this->assertEquals('415', $result);
}
PHPUnit returns:
1) MessageTest::testValidateWhenInvalid
Failed asserting that 'true' matches expected '415'.
I'm unsure why $result is returning 'true' . . . especially as a string value. Also unsure what my 'expected' argument ought to be.
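According to the docs, calling the http_response_code() function with no arguments returns the current response code. Called with an argument, it sets the code and returns the previous one or, outside a web server (as under PHPUnit on the command line), true if none had been set. That is why $result is true in your test.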
<?php
http_response_code(401);
echo http_response_code(); //Output: 401
?>
Therefore your test should look like:
public function testValidateWhenInvalid() {
$stub1 = $this->getMockForAbstractClass('Message');
$path = __DIR__ . '/testDataMalformed.json';
$stub1->_file_input = $path;
$result = $stub1->validate();
// At this point, we have decoded the JSON file inside validate() and have expected it to fail.
// Validate that the return value from HTTP 415.
$this->assertEquals(415, http_response_code()); //Note you will get an int for the return value, not a string
}
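The set-then-read behaviour can be checked in isolation with a short script. This is only a sketch meant to be run from the command line; the variable names are illustrative, and the comments describe what the PHP manual documents for a non-web environment.

```php
<?php
// Per the manual, outside a web server no code is set yet, so this reads false.
$before = http_response_code();

// Sets the code. Outside a web server, with no previous code, this returns true,
// which matches the 'true' seen in the failing assertion.
$setResult = http_response_code(415);

// Reading again now reports the code that was just set, as an int.
$after = http_response_code();

var_dump($before, $setResult, $after);
```

Because validate() returns the result of the setter, asserting on its return value tests the previous state. Asserting on http_response_code() after the call tests the code that was actually set.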
I have a website with a form that makes various SOAP requests at certain points. One of these requests returns a list of induction times, which are displayed so the user can pick one.
The SOAP service returns results, but the information is not shown correctly, and some of the returned object keys are missing entirely.
I have liaised with one of the developers on the SOAP side, and he says the service is fine and is sending the correct information. He has provided a screenshot:
Here is my code that calls the method I need for this information:
public function getInductionTimes($options) {
$client = $this->createSoapRequest();
$inductionTimes = $client->FITinductionlist($options);
//die(print_r($inductionTimes));
return $inductionTimes;
}
private function createSoapRequest() {
$url = 'https://fitspace.m-cloudapps.com:444/FITSPACE/MHservice.asmx?WSDL';
$options["connection_timeout"] = 25;
$options["location"] = $url;
$options['trace'] = 1;
$options['style'] = SOAP_RPC;
$options['use'] = SOAP_ENCODED;
$client = new SoapClient($url, $options);
//die(print_R($client->__getFunctions()));
return $client;
}
As you can see, I print_r the response right after receiving it to check what is returned, and it is this:
As you can see, the IDdtstring field is ignored completely.
Does anyone have any idea why this may be happening? Is it something to do with encoding? I can't seem to get anywhere on this issue!
Thanks
I was able to retrieve the fields correctly, including IDdtstring, using your basic code. Perhaps you are not sending the parameters correctly?
function getInductionTimes($options) {
$client = createSoapRequest();
$inductionTimes = $client->FITinductionlist($options);
die(print_r($inductionTimes));
return $inductionTimes;
}
function createSoapRequest() {
$url = 'https://fitspace.m-cloudapps.com:444/FITSPACE/MHservice.asmx?WSDL';
$options["connection_timeout"] = 25;
$options["location"] = $url;
$options['trace'] = 1;
$options['style'] = SOAP_RPC;
$options['use'] = SOAP_ENCODED;
$client = new SoapClient($url, $options);
//die(print_R($client->__getFunctions()));
return $client;
}
getInductionTimes(array("IDDate" => "2013-06-28T13:00:00+01:00", "GYMNAME" => "Bournemouth"));
I solved this by adding the following line to my SOAP options array. I presume the cause was PHP caching an outdated copy of the WSDL:
$options['cache_wsdl'] = WSDL_CACHE_NONE;
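For context, here is a sketch of how that option might sit alongside the others from the question. buildSoapOptions() is a hypothetical helper, the URL is a placeholder, and the style/use options are left out only to keep the sketch free of other ext-soap constants. It builds the array and makes no request.

```php
<?php
// Hypothetical helper that only builds the options array; no request is made.
function buildSoapOptions(string $wsdlUrl): array
{
    return [
        'connection_timeout' => 25,
        'location'           => $wsdlUrl,
        'trace'              => 1,
        // Skip PHP's WSDL cache so a changed service definition is re-read.
        // The constant comes from ext-soap; 0 is its value, used here as a
        // fallback so the sketch also runs without the extension.
        'cache_wsdl'         => defined('WSDL_CACHE_NONE') ? WSDL_CACHE_NONE : 0,
    ];
}

$url = 'https://example.com/service.asmx?WSDL'; // placeholder, not the real service
$options = buildSoapOptions($url);
print_r($options);

// With ext-soap loaded, the client would then be created as in the question:
// $client = new SoapClient($url, $options);
```

Disabling the cache costs an extra WSDL download per client, so it is mostly useful while a service definition is still changing.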
I'm trying to use the GiantBomb API to query video games. When I enter the URL into a browser, it works just fine and the JSON data shows up.
Here's an example URL:
http://www.giantbomb.com/api/search/?api_key=83611ac10d0dfghfgh157177ecb92b0a5a2350c59a5de4&query=Mortal+Kombat&format=json
But when I use the PHP wrapper that I'm just starting to build, it returns HTML.
Here's the start of my wrapper code (very amateur for now).
You'll notice that in the 'request' method I've commented out the return with json_decode, because when I uncomment it the page throws a 500 error. So I wanted to see what happens when I just echo the response, and it echoes an HTML page. Surely it should echo the same thing that is shown when I enter that URL into the browser?
However, if I replace the URL with, say, a Google Maps URL, it echoes JSON data just fine, without using json_decode. Any ideas as to what's going on here?
class GiantBombApi {
public $api_key;
public $base_url;
public $format;
function __construct() {
$this->format="&format=json";
$this->api_key = "83611ac10d0d157177ecb92b0a5a2350c59a5de4";
$this->search_url = "http://www.giantbomb.com/api/search/?api_key=".$this->api_key."&query=";
}
public function search($query){
$query = urlencode($query);
$url = $this->search_url.$query.$this->format;
return $this->request($url);
}
public function request($url) {
$response = file_get_contents($url);
echo $response;
//return json_decode($response, true);
}
}
//TESTING SECTION
$games = new GiantBombApi;
$query = $_GET['search'];
echo $games->search($query);
I ran a few requests through Postman, and it seems that the API looks at the MIME type as well as the query string. So try setting a header of "format" to "json".
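With file_get_contents(), request headers are passed through a stream context. The sketch below shows one way to do that; buildHeaderContext() is a hypothetical helper, and the Accept header is an extra assumption added next to the "format" header suggested above. Nothing is sent over the network here.

```php
<?php
// Hypothetical helper: build a stream context that sends extra request headers.
function buildHeaderContext(array $headers)
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }

    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            'header'        => implode("\r\n", $lines),
            'ignore_errors' => true, // still return the body on 4xx/5xx responses
        ],
    ]);
}

$context = buildHeaderContext(['format' => 'json', 'Accept' => 'application/json']);
$options = stream_context_get_options($context);
echo $options['http']['header'], PHP_EOL;

// Usage against the real API (not run here):
// $response = file_get_contents($url, false, $context);
// $data = json_decode($response, true);
```

Checking json_last_error_msg() after json_decode() would also show whether the body that came back was JSON at all or the HTML page described in the question.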
I am creating a metasearch engine using the Yandex API. Yandex returns results in XML format, so I need to traverse the XML response in order to get fields such as the URL, title, and description.
The XML response from Yandex is as follows:
http://pastebin.com/kAVAVri9
This is how I have implemented it:
$dom5 = new DOMDocument();
if ($dom5->loadXML($site_results)) {
$results = $dom5->getElementsByTagName("response");
$results1 = $results->getElementsByTagName("results");
$results2 = $results1->getElementsByTagName("group");
$totals["yandex"] = 1000;
foreach ($results1 as $link) {
$url = $link->getElementsByTagName("doc")->item(2)->nodeValue;
;
$url = str_replace('http://', '', $url);
if (substr($url, -1, 1) == '/') {
$url = substr($url, 0, strlen($url) - 1);
}
$search_results[$i]["url"] = $url;
$title = $link->getElementsByTagName("doc")->item(4)->nodeValue;
$search_results[$i]["title"] = $title;
$test = $link->getElementsByTagName("doc");
$test1 = $test->getElementsByTagName("title");
$desc = $test1->getElementsByTagName("headline")->item(0)->nodeValue;
$search_results[$i]["desc"] = $desc;
$search_results[$i]["engine"] = 'yandex';
$search_results[$i]["position"] = $i + 1;
$i++;
}
}
I am new to PHP, so please forgive me if I have made a stupid mistake. I am unable to retrieve the results with my implementation. Please help me find the mistake and get the necessary fields from the XML response.
Thank you!
The method getElementsByTagName() returns a DOMNodeList:
$results = $dom5->getElementsByTagName("response");
The DOMNodeList does not have a method called getElementsByTagName(), but you call it:
$results1 = $results->getElementsByTagName("results");
Therefore the fatal error is triggered: whenever you call a method in PHP that does not exist on the object, you get a fatal error and your script stops.
Do not call undefined object methods and you should be fine.
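If you want to stay with DOMDocument, the usual pattern is to iterate the DOMNodeList and call getElementsByTagName() on each DOMElement it contains. The following is a sketch only: the XML is made up to imitate the nesting named in the question, and the real Yandex response may differ in detail.

```php
<?php
// Made-up XML that imitates the nesting named in the question.
$xml = <<<XML
<yandexsearch>
  <response>
    <results>
      <grouping>
        <group><doc><url>http://example.com/</url><title>Example</title><headline>First hit</headline></doc></group>
        <group><doc><url>http://example.org/a</url><title>Other</title><headline>Second hit</headline></doc></group>
      </grouping>
    </results>
  </response>
</yandexsearch>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$searchResults = [];
// getElementsByTagName() on the document searches all descendants, so the
// intermediate response/results steps are not needed to reach <group>.
foreach ($dom->getElementsByTagName('group') as $group) {
    // $group is a DOMElement, which does have getElementsByTagName().
    $url = $group->getElementsByTagName('url')->item(0)->nodeValue;

    $searchResults[] = [
        // Strip the scheme and any trailing slash, as the question's code intends.
        'url'      => rtrim(preg_replace('~^https?://~', '', $url), '/'),
        'title'    => $group->getElementsByTagName('title')->item(0)->nodeValue,
        'desc'     => $group->getElementsByTagName('headline')->item(0)->nodeValue,
        'engine'   => 'yandex',
        'position' => count($searchResults) + 1,
    ];
}

print_r($searchResults);
```

Looking elements up by name inside each group avoids the fixed item(2) and item(4) indexes from the question, which depend on the exact order of nodes in the response.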
Apart from these basics, I normally suggest SimpleXML for parsing such XML documents. This XML file is a little specific, however, so I suggest extending SimpleXML and adding the features you are likely to need, drawing partly on regular expressions and partly on DOMDocument.
One concept you should know about when parsing these XML files is Xpath. For example to access the elements you had that many problems with above, you can write the path literally:
/*/response/results/grouping/group
In PHP with SimpleXML this looks like:
$url = 'http://pastebin.com/raw.php?i=kAVAVri9';
$xml = simplexml_load_file($url, 'MySimpleXML');
foreach ($xml->xpath('/*/response/results/grouping/group') as $link) {
# ... operate on $link
}
A larger example:
$url = 'http://pastebin.com/raw.php?i=kAVAVri9';
$url = '../data/yandex.xml';
$xml = simplexml_load_file($url, 'MySimpleXML');
foreach ($xml->xpath('/*/response/results/grouping/group') as $link) {
$url = $link->doc->url->str()->preg('~^https?://(.*?)/*$~u', '$1');
$title = $link->doc->title->text();
$headline = $link->doc->headline->text();
printf("<%s> %s\n%s\n\n", $url, $title, wordwrap($headline));
}
And its example output:
<www.facebook.com> " Facebook" - a social networking service
Allows users to find and communicate with friends, classmates and
colleagues, share thoughts, photos and videos, and join various groups.
<en.wikipedia.org/wiki/Facebook> Facebook - Wikipedia, the free encyclopedia
Facebook is a social networking service launched in February 2004, owned
and operated by Facebook, Inc. As of September 2012, Facebook has over one
billion active users, more than half of them using Facebook on a mobile
device.
<mashable.com/category/facebook> Facebook
...
The PHP code example above needs some more code to work because it extends from SimpleXML for the ease of use. This is done with the following code:
class MySimpleXML extends SimpleXMLElement
{
public function text()
{
$string = null === $this[0] ? ''
: (dom_import_simplexml($this)->textContent);
return $this->str($string)->normlaizeWS();
}
public function str($string = null)
{
return new MyString($string ?: $this);
}
}
class MyString
{
private $string;
public function __construct($string)
{
$this->string = $string;
}
public function preg($pattern, $replacement)
{
return new self(preg_replace($pattern, $replacement, $this));
}
public function normlaizeWS()
{
return $this->preg('~\s+~', ' ');
}
public function __toString()
{
return (string) $this->string;
}
}
This might all be a bit much to begin with; check out the PHP manual for SimpleXML and the other functions used in the code example.
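As a gentler starting point, the same XPath can be tried with stock SimpleXMLElement and no subclass. This is a sketch under assumptions: the XML is made up to match the path used above, and the <hlword> element is included only to show why the text is read through dom_import_simplexml(), which collects text from nested elements as well.

```php
<?php
// Made-up XML matching the path /*/response/results/grouping/group.
$xml = <<<XML
<yandexsearch>
  <response>
    <results>
      <grouping>
        <group><doc><url>http://example.com/</url><title><hlword>Example</hlword> site</title></doc></group>
        <group><doc><url>http://example.org/a</url><title>Other   site</title></doc></group>
      </grouping>
    </results>
  </response>
</yandexsearch>
XML;

$root = simplexml_load_string($xml);

$rows = [];
foreach ($root->xpath('/*/response/results/grouping/group') as $group) {
    // Same pattern as in the larger example: drop the scheme and trailing slashes.
    $url = preg_replace('~^https?://(.*?)/*$~u', '$1', (string) $group->doc->url);

    // textContent includes text inside nested elements such as <hlword>.
    $title = dom_import_simplexml($group->doc->title)->textContent;
    $title = trim(preg_replace('~\s+~', ' ', $title)); // normalize whitespace

    $rows[] = [$url, $title];
    printf("<%s> %s\n", $url, $title);
}
```

The subclass shown earlier wraps these same steps behind text(), str(), and preg(), so the loop body stays short once there are many fields to read.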