I have an array of ids called $enabled as follows:
Array
(
[0] => 374468
[1] => 406927
[2] => 177652
[3] => 488233
[4] => 167711
[5] => 166049
[6] => 492313
[7] => 166499
[8] => 489315
[9] => 396740
[10] => 167639
etc...
I also have an xml file 'GetProperty.xml', as follows (values omitted for security):
<?xml version="1.0" encoding="UTF-8"?>
<scAPI>
<client>
<ID></ID>
<key></key>
<siteID></siteID>
<propertyID></propertyID>
</client>
</scAPI>
Lastly, I have a PHP method which posts the XML to the desired endpoint, and which accepts two arguments (the XML as $xmlpost and the endpoint as $url):
class apiClass {
    // POST the given XML string to $url and return the response body (false on failure).
    public function curl($xmlpost, $url) {
        $ch = curl_init();
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $url,
            CURLOPT_POST => 1,
            CURLOPT_POSTFIELDS => $xmlpost
        ));
        // Execute the request once and keep the result.
        $xml = curl_exec($ch);
        if ($xml === false) {
            echo 'Curl error: ' . curl_error($ch);
        }
        curl_close($ch);
        return $xml;
    }
}
I now need to:
take each ID in turn
edit the value of the propertyID node in the xml to match the ID
and then POST that XML to the desired endpoint using the curl method
I need to iterate through the ID array with a foreach loop to POST each ID in turn (or do something altogether different if there is a better way).
Thanks in advance for your help.
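A minimal sketch of the loop described above, assuming the template has the structure shown in the question. The helper name buildPropertyXml() is hypothetical, the template is inlined here instead of being read from GetProperty.xml, and the actual POST is left as a comment because the endpoint is not given:

```php
<?php
// Stand-in for file_get_contents('GetProperty.xml'); values left empty as in the question.
$template = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<scAPI>
  <client>
    <ID></ID>
    <key></key>
    <siteID></siteID>
    <propertyID></propertyID>
  </client>
</scAPI>
XML;

// Hypothetical helper: returns a copy of the template with <propertyID> set to $propertyId.
function buildPropertyXml(string $templateXml, int $propertyId): string
{
    $xml = new SimpleXMLElement($templateXml);
    $xml->client->propertyID = (string) $propertyId;
    return $xml->asXML();
}

$enabled = [374468, 406927, 177652]; // first few IDs from the question

$payloads = [];
foreach ($enabled as $id) {
    $payloads[$id] = buildPropertyXml($template, $id);
    // With the class from the question, the POST would then be something like:
    // $response = (new apiClass())->curl($payloads[$id], $url);
}

echo $payloads[374468];
```

Parsing the template fresh for each ID keeps the iterations independent, so a failed request for one ID cannot leave a stale propertyID behind for the next.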
So I'm trying to scrape this page:
http://www.asx.com.au/asx/statistics/todayAnns.do
My code can't seem to get the whole page's HTML; it behaves very strangely.
I've tried Simple HTML DOM, but nothing works.
$base = "http://www.asx.com.au/asx/statistics/todayAnns.do";
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
echo htmlspecialchars($str);
This shows mostly JavaScript and I can't get the page content. My goal is to scrape the middle table at that URL.
If you don't need the most recent data then you can use the cached version of the page from Google.
<?php
use Scraper\Scrape\Crawler\Types\GeneralCrawler;
use Scraper\Scrape\Extractor\Types\MultipleRowExtractor;
require_once(__DIR__ . '/../vendor/autoload.php');
date_default_timezone_set('UTC');
// Create crawler
$crawler = new GeneralCrawler(
'http://webcache.googleusercontent.com/search?q=cache:http://www.asx.com.au/asx/statistics/todayAnns.do&num=1&strip=0&vwsrc=0'
);
// Setup configuration
$configuration = new \Scraper\Structure\Configuration();
$configuration->setTargetXPath('//div[@class="page"]//table');
$configuration->setRowXPath('.//tr');
$configuration->setFields(
[
new \Scraper\Structure\TextField(
[
'name' => 'Headline',
'xpath' => './/td[3]',
]
),
new \Scraper\Structure\TextField(
[
'name' => 'Published',
'xpath' => './/td[1]',
]
),
new \Scraper\Structure\TextField(
[
'name' => 'Pages',
'xpath' => './/td[4]',
]
),
new \Scraper\Structure\AnchorField(
[
'name' => 'Link',
'xpath' => './/td[5]/a',
'convertRelativeUrl' => false,
]
),
new \Scraper\Structure\TextField(
[
'name' => 'Code',
'xpath' => './/text()',
]
),
]
);
// Extract data
$extractor = new MultipleRowExtractor($crawler, $configuration);
$data = $extractor->extract();
print_r($data);
I was able to get the following data using the above code.
Array
(
[0] => Array
(
[Code] => ASX
[hash] => 6e16c02b10a10baf739c2613bc87f906
)
[1] => Array
(
[Headline] => Initial Director's Interest Notice
[Published] => 10:57 AM
[Pages] => 1
[Link] => /asx/statistics/displayAnnouncement.do?display=pdf&idsId=01868833
[Code] => STO
[hash] => aa2ea9b1b9b0bc843a4ac41e647319b4
)
[2] => Array
(
[Headline] => Becoming a substantial holder
[Published] => 10:53 AM
[Pages] => 2
[Link] => /asx/statistics/displayAnnouncement.do?display=pdf&idsId=01868832
[Code] => AKG
[hash] => f8ff8dfde597a0fc68284b8957f38758
)
[3] => Array
(
[Headline] => LBT Investor Conference Call Business Update
[Published] => 10:53 AM
[Pages] => 9
[Link] => /asx/statistics/displayAnnouncement.do?display=pdf&idsId=01868831
[Code] => LBT
[hash] => cc78f327f2b421f46036de0fce270a6d
)
...
Disclaimer: I used the https://github.com/rajanrx/php-scrape framework, and
I am the author of that library. You can also grab the data with plain cURL using the
XPath expressions listed above. I hope this is helpful :)
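For the plain-PHP route, here is a rough sketch that applies the same XPath expressions with DOMDocument and DOMXPath. The HTML below is a made-up stand-in for the fetched page (only the column positions are taken from the configuration above), so treat it as an illustration rather than the real markup:

```php
<?php
// Hypothetical sample of the fetched markup; in practice this would be the cURL response body.
$html = <<<'HTML'
<html><body><div class="page"><table>
<tr><th>Time</th><th></th><th>Headline</th><th>Pages</th><th>PDF</th></tr>
<tr><td>10:57 AM</td><td></td><td>Initial Director's Interest Notice</td><td>1</td>
<td><a href="/asx/statistics/displayAnnouncement.do?display=pdf&amp;idsId=01868833">PDF</a></td></tr>
</table></div></body></html>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = [];
// Only rows that contain <td> cells, which skips the header row.
foreach ($xpath->query('//div[@class="page"]//table//tr[td]') as $tr) {
    $rows[] = [
        'Published' => trim($xpath->evaluate('string(td[1])', $tr)),
        'Headline'  => trim($xpath->evaluate('string(td[3])', $tr)),
        'Pages'     => trim($xpath->evaluate('string(td[4])', $tr)),
        'Link'      => $xpath->evaluate('string(td[5]/a/@href)', $tr),
    ];
}

print_r($rows);
```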
cURL can only load the markup of the page. The page above uses JavaScript to load its data after the page has loaded, so you might have to use PhantomJS or Splash.
This link might help: https://stackoverflow.com/a/20554152/3086531
To fetch the data on the server side, you can use PhantomJS from PHP: render the page inside PhantomJS, then pass the data back to PHP via exec().
This article walks through the process step by step: http://shout.setfive.com/2015/03/30/7817/
I have an issue here. I am trying to make a request to a web API: http://www.speedex.gr/getvoutrans/getvoutrans.asmx?WSDL
I am sending a request to insertPodData().
I am using PHP and SOAP.
I connect successfully and pass the correct credentials. However, I am not able to send a DataSet (because I do not know the right way), so I get an empty dataset back.
DataSets are a .NET construct, so it is kind of tricky with PHP.
I already tried sending it as an array, but I still get an empty result.
Here is the code.
PHP:
$dataset = array(
'schema' => array(
'Enter_Branch_Id' => $speedex_branch_id,
'SND_Customer_Id' => $speedex_cust_id,
'SND_Agreement_Id' => $speedex_appi_key,
'RCV_Name' => 'Test',
'RCV_Addre1' => 'Test Adress',
'RCV_Zip_Code' => '54636',
'RCV_City' => 'Thessaloniki',
'RCV_Country' => 'Greece',
'RCV_Tel1' => '*******',
'Voucher_Weight' => '0',
),
'any' => ''
);
try {
$soap = new SoapClient("http://www.speedex.gr/getvoutrans/getvoutrans.asmx?WSDL",array('trace' => true));
$oAuthResult = $soap->insertPodData(
array(
'username' => $speedex_usrn,
'password' => $speedex_psw,
'VoucherTable' => $dataset,
'_tableFlag' => 3
)
);
$resultVoucher = $oAuthResult;
print_r($resultVoucher);
echo '<br>';
echo "REQUEST:\n" . htmlentities($soap->__getLastRequest()) . "\n";
die();
}catch(SoapFault $fault) {
die('<h1>Ooooops something is broken. Refresh or contact module creator </h1><br>'.$fault);
}
This returns the following result:
RESULT: stdClass Object ( [insertPodDataResult] => 1 [newVoucherTable] => stdClass Object ( [schema] => [any] => ) )
REQUEST:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://tempuri.org/" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<SOAP-ENV:Body>
<ns1:insertPodData>
<ns1:username>****</ns1:username>
<ns1:password>****</ns1:password>
<ns1:VoucherTable>********************TestTest Adress54636ThessalonikiGreece********</ns1:VoucherTable>
<ns1:_tableFlag>3</ns1:_tableFlag>
</ns1:insertPodData>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
As you can see, the DataSet is not created: all the values are passed without their element names.
Any ideas or clues? Thanks in advance!
I'm trying to parse the mobile.de API.
I have this code to get the XML:
$titan = TitanFramework::getInstance( 'MWPC' );
$handle = curl_init();
$sellerID = $titan->getOption( 'mwpc_seller_id' );
$auth_token = base64_encode($titan->getOption( 'mwpc_api_usr' ) . ':' . $titan->getOption( 'mwpc_api_pass' ));
curl_setopt_array(
$handle,
array(
CURLOPT_URL => 'http://services.mobile.de/1.0.0/ad/search?customerId='.$sellerID,
CURLOPT_POST => false,
// CURLINFO_CONTENT_TYPE is a curl_getinfo() constant, not a request option; the Accept header below asks for XML.
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HTTPHEADER => array(
'Authorization: Basic '. $auth_token,
'accept: application/xml',
'Accept-Language: de, en'
)
)
);
$response = curl_exec($handle);
I get the following XML Response:
<search:result xmlns:resource="http://services.mobile.de/schema/resource" xmlns:seller="http://services.mobile.de/schema/seller" xmlns:ad="http://services.mobile.de/schema/ad" xmlns:search="http://services.mobile.de/schema/search" xmlns:financing="http://services.mobile.de/schema/common/financing-1.0" xmlns:error="http://services.mobile.de/schema/common/error-1.0" total="28" page-size="20" current-page="1" max-pages="2">
<ad:ad key="123456" url="http://services.mobile.de/1.0.0/ad/...">
<ad:creation-date value="2014-12-08T09:51:39+01:00"/>
<ad:modification-date value="2015-01-09T12:56:16+01:00"/>
<ad:detail-page url="http://suchen.mobile.de/auto-inserat/..."/>
<ad:vehicle>
<ad:class key="Car" url="http://services.mobile.de/1.0.0/refdata/classes/Car">
<resource:local-description xml-lang="de">Pkw</resource:local-description>
</ad:class>
<ad:category key="Cabrio" url="http://services.mobile.de/1.0.0/refdata/categories/Cabrio">
<resource:local-description xml-lang="de">Cabrio/Roadster</resource:local-description>
</ad:category>
<ad:make key="MINI" url="http://services.mobile.de/1.0.0/refdata/classes/Car/makes/MINI">
<resource:local-description xml-lang="de">MINI</resource:local-description>
</ad:make>
<ad:model key="COOPER_SD_CABRIO" url="http://services.mobile.de/1.0.0/refdata/classes/Car/makes/MINI/models/COOPER_SD_CABRIO">
<resource:local-description xml-lang="de">Cooper SD Cabrio</resource:local-description>
</ad:model>
And the XML object handling:
$XML_Obj = simplexml_load_string($response);
echo '<pre>';
print_r($XML_Obj);
echo '</pre>';
The above code will output:
SimpleXMLElement Object
(
[@attributes] => Array
(
[total] => 28
[page-size] => 20
[current-page] => 1
[max-pages] => 2
)
)
How can I echo the data from inside the "ad:..." elements?
I'm stuck at this point! I have been googling for many hours :(
Edit:
If I use this code from the suggestion, I get an empty array:
$att = $XML_Obj->xpath("//ad[@key='car']");
You can use SimpleXML's xpath function to search by attributes, e.g.:
$XML_Obj = simplexml_load_string($response);
$att = $XML_Obj->xpath("//ad[@key='foo']");
I struggled for two days with the same problem. First of all, never use print_r or var_dump for debug output on a SimpleXMLElement. Use these functions instead: https://github.com/IMSoP/simplexml_debug
The second problem is the namespaces in the XML. A dirty hack is to use str_replace to remove the XML namespace: https://www.hacksparrow.com/how-to-manhandle-xml-with-namespace-in-php.html
And the final solution, which works like a charm:
https://outlandish.com/blog/tutorial/xml-to-json/
The @attributes key will have to be referenced like
$XMLobj->{'@attributes'}
You can iterate over it with a foreach()
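Since the namespaces are the sticking point in this thread, here is a sketch of reading the namespaced elements directly with SimpleXML. The XML is a trimmed copy of the response from the question with the closing tags added so that it parses; the namespace URIs come from that response, and the array keys are just illustrative names:

```php
<?php
// Trimmed, well-formed stand-in for the API response shown in the question.
$response = <<<'XML'
<search:result xmlns:resource="http://services.mobile.de/schema/resource"
    xmlns:ad="http://services.mobile.de/schema/ad"
    xmlns:search="http://services.mobile.de/schema/search"
    total="28" page-size="20" current-page="1" max-pages="2">
  <ad:ad key="123456" url="http://services.mobile.de/1.0.0/ad/...">
    <ad:creation-date value="2014-12-08T09:51:39+01:00"/>
    <ad:vehicle>
      <ad:make key="MINI" url="http://services.mobile.de/1.0.0/refdata/classes/Car/makes/MINI">
        <resource:local-description xml-lang="de">MINI</resource:local-description>
      </ad:make>
    </ad:vehicle>
  </ad:ad>
</search:result>
XML;

$adNs       = 'http://services.mobile.de/schema/ad';
$resourceNs = 'http://services.mobile.de/schema/resource';

$xml = simplexml_load_string($response);

$ads = [];
// children($ns) selects child elements in that namespace; plain ->ad would find nothing.
foreach ($xml->children($adNs)->ad as $ad) {
    $make = $ad->children($adNs)->vehicle->children($adNs)->make;
    $ads[] = [
        // Un-prefixed attributes are in no namespace, so attributes() takes no argument here.
        'key'       => (string) $ad->attributes()->key,
        'created'   => (string) $ad->children($adNs)->{'creation-date'}->attributes()->value,
        'make'      => (string) $make->attributes()->key,
        'makeLabel' => (string) $make->children($resourceNs)->{'local-description'},
    ];
}
print_r($ads);

// The XPath route needs the prefix in the expression; registering it makes that explicit.
$xml->registerXPathNamespace('ad', $adNs);
$matches = $xml->xpath('//ad:ad[@key="123456"]');
```

This also suggests why the //ad[...] expression in the edit returned an empty array: without the ad: prefix, XPath looks for an element named ad in no namespace, and the response contains none.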
<?php
$apiKey = 'HnUvAYIy5hO7iPki';
$apiSecret = '9HBkibOphng7w4p1ZdiiVJgzRI4kpD4Q';
$msg = filter_input(INPUT_GET,"msg",FILTER_SANITIZE_STRING);
$message = array(
'message' => array(
'message' => ''.$msg.'',
'chatBotID' => 6,
'timestamp' => time()
),
'user' => array(
'firstName' => 'Tugger',
'lastName' => 'Sufani',
'gender' => 'm',
'externalID' => 'abc-639184572'
)
);
// construct the data
$host = "http://www.personalityforge.com/api/chat/";
$messageJSON = json_encode($message);
$hash = hash_hmac('sha256', $messageJSON, $apiSecret);
$url = $host."?apiKey=".$apiKey."&hash=".$hash."&message=".urlencode($messageJSON);
// Make the call using cURL
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// make the call
$response = curl_exec($ch);
curl_close($ch);
echo 'Response JSON: '.$response.'<br>';
?>
The above code produces the following output:
Response JSON:
Correct parameters and objects received
raw message: {"message":{"message":"hi","chatBotID":6,"timestamp":1407393907},"user":{"firstName":"Tugger","lastName":"Sufani","gender":"m","externalID":"abc-639184572"}}
apiSecret: 9HBkibOphng7w4p1ZdiiVJgzRI4kpD4Q
Do the following two match?
a702fc336f49b099764d35a548dd110fac2067bcd14a438676e4d579d70b6afc
a702fc336f49b099764d35a548dd110fac2067bcd14a438676e4d579d70b6afc
CORRECT MATCH!
Array
(
[message] => Array
(
[message] => hi
[chatBotID] => 6
[timestamp] => 1407393907
)
[user] => Array
(
[firstName] => Tugger
[lastName] => Sufani
[gender] => m
[externalID] => abc-639184572
)
)
sent on Thu, 07 Aug 2014 2:45:07 am
6 seconds ago. (limit is 300)
{"success":1,"errorMessage":"","message":{"chatBotName":"Desti","chatBotID":"6","message":"Yes, Tugger, I've heard that one before.","emotion":"normal"}}
Now I need to get the content of the "message" field that is in italics.
Thanks in advance. Please help, I have tried a lot :-)
If you are looking for a regex solution, here it is:
Regex
"message":"(.*?)"
TestString
{"success":1,"errorMessage":"","message":{"chatBotName":"Desti","chatBotID":"6","message":"Yes, Tugger, I've heard that one before.","emotion":"normal"}
Result
MATCH 1
[91-131] Yes, Tugger, I've heard that one before.
PHP usage
preg_match('/(?<=success).*"message":"(.*?)"/', $response , $matches);
$message = $matches[1];
The regex has been updated to match against the full response text.
Use json_decode(). Applied to the JSON part of the response, it returns:
stdClass Object
(
[success] => 1
[errorMessage] =>
[message] => stdClass Object
(
[chatBotName] => Desti
[chatBotID] => 6
[message] => Yes, Tugger, I've heard that one before.
[emotion] => normal
)
)
Sample (passing true as the second argument returns nested arrays instead of objects):
$output = json_decode($input, true);
print_r($output);
echo "<br />".$output['message']['message'];
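One caveat with json_decode() here: the response in the question has debug text in front of the JSON, and decoding the whole body would fail. A minimal sketch that first cuts out the JSON, assuming the JSON object is the last thing in the body and starts with {"success" (the function name extractReply() is hypothetical):

```php
<?php
// Shortened stand-in for the response in the question: debug text, then the JSON line.
$response = "Correct parameters and objects received\n"
    . "CORRECT MATCH!\n"
    . '{"success":1,"errorMessage":"","message":{"chatBotName":"Desti","chatBotID":"6",'
    . '"message":"Yes, Tugger, I\'ve heard that one before.","emotion":"normal"}}';

// Hypothetical helper: returns the bot's reply text, or null if it cannot be found.
function extractReply(string $response): ?string
{
    // Assumption: the JSON payload begins at the last occurrence of {"success"
    $start = strrpos($response, '{"success"');
    if ($start === false) {
        return null;
    }

    $data = json_decode(substr($response, $start), true);
    if (!is_array($data) || empty($data['success'])) {
        return null;
    }

    return $data['message']['message'] ?? null;
}

$reply = extractReply($response);
echo $reply, "\n";
```

Compared with the regex, this keeps working if the reply itself contains escaped quotes, because the JSON parser handles the unescaping.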
I've read hundreds of SO entries about this, but I can't get it to work. I can't really see what I'm doing wrong. I'm almost certainly doing something obviously stupid, but at the moment I can't see it.
I'm trying to parse http://api.spreadshirt.net/api/v1/shops/614852/productTypes?locale=de_DE&fullData=false&limit=20&offset=0
This is what I'm doing:
$shopUrl = "http://api.spreadshirt.net/api/v1/shops/614852/productTypes?".
"locale=de_DE&fullData=false&limit=20&offset=0";
$ch = curl_init($shopUrl);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$result = curl_exec($ch);
curl_close($ch);
$products = new SimpleXMLElement($result);
foreach ($products->productType as $product) {
$resources = $product->children('http://www.w3.org/1999/xlink');
$resEntity = array(
'id' => (int)$product->attributes()->id,
'name' => (string)$product->name[0],
'price' => (string)$product->price[0]->vatIncluded[0],
'preview' => $resources
);
echo '<pre>'.print_r($resEntity, true).'</pre>';
}
This outputs:
Array
(
[id] => 6
[name] => Männer T-Shirt klassisch
[price] => 9.90
[preview] => SimpleXMLElement Object
(
[@attributes] => Array
(
[href] => http://api.spreadshirt.net/api/v1/shops/614852/productTypes/6
)
)
)
I'm now trying to access the href attribute, but everything I've tried so far, such as $resources->attributes()->href or $resources['href'], fails: PHP keeps saying "Node no longer exists".
You must specify the namespace in the attributes() method. As far as I can tell (the manual page for attributes() does not explain it in detail), the XML namespace goes in the first argument. That should get you the href attribute from the xlink namespace. Otherwise you only get the attributes that have no namespace prefix, namely type and mediaType (or whichever ones exist on the node you fetch the attributes from).
It should work like this (not tested):
$resources = $product->resources[0]; // <resources> node
$resource = $resources->resource[0]; // first <resource> node
$attribs = $resource->attributes('http://www.w3.org/1999/xlink'); // fetch all attributes from the given namespace
var_dump($attribs->href); // or maybe var_dump((string)$attribs->href);
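Judging by the print_r output in the question, the href also appears to sit on the productType element itself, in the xlink namespace. A small sketch of reading it there, using a simplified, made-up stand-in for the API response (the real document has more elements and may declare further namespaces):

```php
<?php
// Simplified, hypothetical stand-in for the productTypes response.
$result = <<<'XML'
<productTypes xmlns:xlink="http://www.w3.org/1999/xlink">
  <productType xlink:href="http://api.spreadshirt.net/api/v1/shops/614852/productTypes/6" id="6">
    <name>Männer T-Shirt klassisch</name>
  </productType>
</productTypes>
XML;

$xlinkNs  = 'http://www.w3.org/1999/xlink';
$products = new SimpleXMLElement($result);

$entities = [];
foreach ($products->productType as $product) {
    $entities[] = [
        // id has no prefix, so it is read without a namespace argument.
        'id'   => (int) $product->attributes()->id,
        'name' => (string) $product->name,
        // href is prefixed with xlink:, so the namespace URI must be passed to attributes().
        'href' => (string) $product->attributes($xlinkNs)->href,
    ];
}

print_r($entities);
```

Casting to string right away avoids holding on to SimpleXMLElement objects that point at nodes which may not exist.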