I'm trying to create an XML feed, essentially of a bunch of job listings. I have 39 job listings in the database right now, and I'm creating the XML using SimpleXML and it's working just fine, except it's only outputting the very last record from the database in the xml. I'm sure there's an easy solution.
Looking at the code, I want each job to be inside the <job> element, and I want a new <job> element to be created for each job. All of these are enclosed inside one <source> element. Here is my PHP code, and below that is the result I'm getting - you'll see there's only one row returning instead of all 39.
<?php
Header('Content-type: text/xml');
class SimpleXMLExtended extends SimpleXMLElement {
public function addCData($cdata_text) {
$node = dom_import_simplexml($this);
$no = $node->ownerDocument;
$node->appendChild($no->createCDATASection($cdata_text));
}
}
$jobs = $dbjobs->find(array('job_title' => array('$exists' => true), 'job_title' => array('$nin'=> array('',' ', null))));
$jobs = iterator_to_array($jobs);
$xml = new SimpleXMLExtended('<source/>');
$i = 0;
foreach ($jobs as $job) {
$i++;
$xml->job = NULL;
$j = $xml->job;
$j->referencenumber = NULL;
$j->referencenumber->addCData($job['id']);
$j->title = NULL;
$j->title->addCData($job['job_title']);
$j->url = NULL;
$j->url->addCData('http://www.site.com/joblisting.php?jl=' . $job['id']);
$j->description = NULL;
$j->description->addCData($job['job_description']);
$j->company = NULL;
$j->company->addCData($job['company']);
$j->city = NULL;
$j->city->addCData($job['city']);
$j->state = NULL;
$j->state->addCData($job['state']);
$j->postalcode = NULL;
$j->postalcode->addCData('');
$j->country = NULL;
$j->country->addCData('US');
$j->date = NULL;
$j->date->addCData(date("Y-m-d", $job['added']->sec));
$j->site = NULL;
$j->site->addCData('site.com');
$j->count = NULL;
$j->count->addCData($i);
}
print($xml->asXML());
?>
And here is an example response I get:
<source>
<job>
<referencenumber>230257</referencenumber>
<title>Home Phone Representative</title>
<url>http://www.site.com/joblisting.php?jl=230257</url>
<description></description>
<company>Media LLC</company>
<city>San Jose</city>
<state>CA</state>
<postalcode></postalcode>
<country>US</country>
<date>2013-09-16</date>
<site>site.com</site>
<count>39</count>
</job>
</source>
As you can see it populates just fine but I need all the listings instead of just the last one in the loop. Thanks for your help in advance.
$xml->job = NULL;
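is the wrong line: every iteration assigns to the same single <job> element, so each record overwrites the previous one and only the last survives.
The $j->xxxxxx = NULL; assignments are not needed either if you create each child with addChild().
Code it like this instead:
foreach ($jobs as $job) {
$i++;
$j = $xml->addChild('job');
$j->addChild('referencenumber')->addCData($job['id']);
$j->addChild('title')->addCData($job['job_title']);
(...)
To learn more, check the SimpleXMLElement::addChild documentation.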
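A minimal, self-contained sketch of the addChild() approach. The two rows are made-up stand-ins for the database cursor, and only a few of the fields are shown; the CDATA helper is the one from the question.

```php
<?php
// Same helper as in the question: appends a CDATA section to the current element.
class SimpleXMLExtended extends SimpleXMLElement
{
    public function addCData(string $text): void
    {
        $node = dom_import_simplexml($this);
        $node->appendChild($node->ownerDocument->createCDATASection($text));
    }
}

// Hypothetical sample rows standing in for the database results.
$jobs = [
    ['id' => 1, 'job_title' => 'Phone Representative', 'city' => 'San Jose'],
    ['id' => 2, 'job_title' => 'Pharmacy Technician', 'city' => 'Vallejo'],
];

$xml = new SimpleXMLExtended('<source/>');
foreach ($jobs as $job) {
    // addChild() appends a new <job> on every iteration and returns it.
    $j = $xml->addChild('job');
    $j->addChild('referencenumber')->addCData((string) $job['id']);
    $j->addChild('title')->addCData($job['job_title']);
    $j->addChild('city')->addCData($job['city']);
}

$output = $xml->asXML();
echo $output;
```

Because addChild() is called on an instance of the subclass, the elements it returns should also be SimpleXMLExtended objects, which is why addCData() can be chained directly.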
You should add the children to the root:
foreach ($jobs as $job) {
$i++;
$j = $xml->addChild('job');
...
I am trying to scrape this webpage. I need each job's title and location, and my code retrieves both. The problem is that when I write the results to XML, only one entry from the list ends up in the file.
I am using the Goutte CSS selector library. Please also tell me how to scrape paginated results with it.
Here is my code:
$httpClient = new \Goutte\Client();
$response = $httpClient->request('GET', 'https://www.simplyhired.com/search?q=pharmacy+technician&l=American+Canyon%2C+CA&job=X5clbvspTaqzIHlgOPNXJARu8o4ejpaOtgTprLm2CpPuoeOFjioGdQ');
$job_posting_location = [];
$response->filter('.LeftPane article .SerpJob-jobCard.card .jobposting-subtitle span.JobPosting-labelWithIcon.jobposting-location span.jobposting-location')
->each(function ($node) use (&$job_posting_location) {
$job_posting_location[] = $node->text() . PHP_EOL;
});
$joblocation = 0;
$response->filter('.LeftPane article .SerpJob-jobCard.card .jobposting-title-container h3 a')
->each( function ($node) use ($job_posting_location, &$joblocation, $httpClient) {
$job_title = $node->text() . PHP_EOL; //job title
$job_posting_location = $job_posting_location[$joblocation]; //job posting location
// display the result
$items = "{$job_title} # {$job_posting_location}\n\n";
global $results;
$result = explode('#', $items);
$results['job_title'] = $result[0];
$results['job_posting_location'] = $result[1];
$joblocation++;
});
function convertToXML($results, &$xml_user_info){
foreach($results as $key => $value){
if(is_array($value)){
$subnode = $xml_user_info->addChild($key);
foreach ($value as $k=>$v) {
$xml_user_info->addChild("$k",htmlspecialchars("$v"));
}
}else{
$xml_user_info->addChild("$key",htmlspecialchars("$value"));
}
}
return $xml_user_info->asXML();
}
$xml_user_info = new SimpleXMLElement('<root/>');
$xml_content = convertToXML($results,$xml_user_info);
$xmlFile = 'details.xml';
$handle = fopen($xmlFile, 'w') or die('Unable to open the file: '.$xmlFile);
if(fwrite($handle, $xml_content)) {
echo 'Successfully written to an XML file.';
}
else{
echo 'Error in file generating';
}
What I got in the XML file:
<?xml version="1.0"?>
<root><job_title>Pharmacy Technician
</job_title><job_posting_location> Vallejo, CA
</job_posting_location></root>
What I want in the XML file:
<?xml version="1.0"?>
<root>
<job_title>Pharmacy Technician</job_title>
<job_posting_location> Vallejo, CA</job_posting_location>
<job_title>Pharmacy Technician 1</job_title>
<job_posting_location> Vallejo, CA</job_posting_location>
<job_title>Pharmacy Technician New</job_title>
<job_posting_location> Vallejo, CA</job_posting_location>
and so on...
</root>
You overwrite the values in the $results variable on every iteration. You would need to do something like this to append:
$results[] = [
'job_title' => $result[0],
'job_posting_location' => $result[1]
];
However, there is no need to put the data into an array at all; just create the
XML directly with DOM.
Both of your selectors share the same start. Iterate over the cards and then fetch
the related data from each one.
$httpClient = new \Goutte\Client();
$response = $httpClient->request('GET', $url);
$document = new DOMDocument();
// append document element node
$postings = $document->appendChild($document->createElement('jobs'));
// iterate job posting cards
$response->filter('.LeftPane article .SerpJob-jobCard.card')->each(
function($jobCard) use ($document, $postings) {
// fetch data
$location = $jobCard
->filter(
'.jobposting-subtitle span.JobPosting-labelWithIcon.jobposting-location span.jobposting-location'
)
->text();
$title = $jobCard->filter('.jobposting-title-container h3 a')->text();
// append 'job' node to group data in result
$job = $postings->appendChild($document->createElement('job'));
// append data nodes
$job->appendChild($document->createElement('job_title'))->textContent = $title;
$job->appendChild($document->createElement('job_posting_location'))->textContent = $location;
}
);
echo $document->saveXML();
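If the data is already sitting in an array, the same DOM calls can turn it into one <job> element per row. This is only a sketch: the rows below are invented sample values, and the element names simply reuse the array keys.

```php
<?php
// Hypothetical scraped rows; in the real script they would come from the crawler.
$rows = [
    ['job_title' => 'Pharmacy Technician', 'job_posting_location' => ' Vallejo, CA'],
    ['job_title' => 'Pharmacy Technician New', 'job_posting_location' => ' Napa, CA'],
];

$document = new DOMDocument('1.0', 'UTF-8');
$document->formatOutput = true; // indent the output for readability
$jobs = $document->appendChild($document->createElement('jobs'));

foreach ($rows as $row) {
    // One <job> wrapper per row keeps title and location grouped together.
    $job = $jobs->appendChild($document->createElement('job'));
    foreach ($row as $name => $value) {
        $element = $job->appendChild($document->createElement($name));
        // A text node escapes &, < and > automatically.
        $element->appendChild($document->createTextNode(trim($value)));
    }
}

$xmlString = $document->saveXML();
echo $xmlString;
```

To write the result to disk instead of echoing it, $document->save('details.xml') can replace the manual fopen()/fwrite() calls.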
I've been trying unsuccessfully with PHP to loop through two XML files and print the result to the screen. The aim is to take a country's name and output its regions/states/provinces as the case may be.
The first block of code successfully prints all the countries but the loop through both files gives me a blank screen.
The countries file is in the format:
<row>
<id>6</id>
<name>Andorra</name>
<iso2>AD</iso2>
<phone_code>376</phone_code>
</row>
And the states.xml:
<row>
<id>488</id>
<name>Andorra la Vella</name>
<country_id>6</country_id>
<country_code>AD</country_code>
<state_code>07</state_code>
</row>
so that country_id = id.
This gives a perfect list of countries:
$xml = simplexml_load_file("countries.xml");
$xml1 = simplexml_load_file("states.xml");
foreach($xml->children() as $key => $children) {
print((string)$children->name); echo "<br>";
}
This gives me a blank screen except for the HTML stuff on the page:
$xml = simplexml_load_file("countries.xml");
$xml1 = simplexml_load_file("states.xml");
$s = "Jamaica";
foreach($xml->children() as $child) {
foreach($xml1->children() as $child2){
if ($child->id == $child2->country_id && $child->name == $s) {
print((string)$child2->name);
echo "<br>";
}
}
}
Where have I gone wrong?
Thanks.
I suspect your problem is not casting the name to a string before doing your comparison. But why are you starting the second loop before checking if it's needed? You're looping through every single item in states.xml needlessly.
$countries = simplexml_load_file("countries.xml");
$states = simplexml_load_file("states.xml");
$search = "Jamaica";
foreach($countries->children() as $country) {
if ((string)$country->name !== $search) {
continue;
}
foreach($states->children() as $state) {
if ((string)$country->id === (string)$state->country_id) {
echo (string)$state->name . "<br/>";
}
}
}
Also, note that naming your variables in a descriptive manner makes it much easier to figure out what's going on with code.
You could probably get rid of the loops altogether using an XPath query to match the sibling value. I don't use SimpleXML, but here's what it would look like with DomDocument:
$search = "Jamaica";
$countries = new DomDocument();
$countries->load("countries.xml");
$xpath = new DomXPath($countries);
$country = $xpath->query("//row[name/text() = '$search']/id/text()");
$country_id = $country[0]->nodeValue;
$states = new DomDocument();
$states->load("states.xml");
$xpath = new DomXPath($states);
$states = $xpath->query("//row[country_id/text() = '$country_id']/name/text()");
foreach ($states as $state) {
echo $state->nodeValue . "<br/>";
}
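For anyone staying with SimpleXML, the same two lookups can be sketched with SimpleXMLElement::xpath(). The inline XML below is made-up sample data (the root element names and the ids are assumptions), loaded with simplexml_load_string() so the snippet runs without the real files.

```php
<?php
// Made-up sample data; root element names and ids are assumptions.
$countries = simplexml_load_string(
    '<countries>
        <row><id>6</id><name>Andorra</name></row>
        <row><id>108</id><name>Jamaica</name></row>
    </countries>'
);
$states = simplexml_load_string(
    '<states>
        <row><id>488</id><name>Andorra la Vella</name><country_id>6</country_id></row>
        <row><id>901</id><name>Kingston</name><country_id>108</country_id></row>
        <row><id>902</id><name>Portland</name><country_id>108</country_id></row>
    </states>'
);

$search = 'Jamaica';
$names = [];

// xpath() returns an array of matching elements (empty when nothing matches).
$match = $countries->xpath("//row[name = '$search']/id");
if ($match) {
    $countryId = (string) $match[0];
    foreach ($states->xpath("//row[country_id = '$countryId']/name") as $state) {
        $names[] = (string) $state;
    }
}

echo implode('<br/>', $names);
```

Note that $search is interpolated straight into the expression, so a value containing a single quote would break the query; that case would need separate handling.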
I'm stuck on something extremely simple.
Here is my xml feed:
http://xml.betfred.com/Horse-Racing-Daily.xml
Here is my code
<?php
function HRList5($viewbets) {
$xmlData = 'http://xml.betfred.com/Horse-Racing-Daily.xml';
$xml = simplexml_load_file($xmlData);
$curdate = date('d/m/Y');
$new_array = array();
foreach ($xml->event as $event) {
if($event->bettype->attributes()->bettypeid == $viewbets){//$_GET['evid']){
// $eventid = $_GET['eventid'];
// if ($limit == $c) {
// break;
// }
// $c++;
$eventd = substr($event->attributes()->{'date'},6,2);
$eventm = substr($event->attributes()->{'date'},4,2);
$eventy = substr($event->attributes()->{'date'},0,4);
$eventt = $event->attributes()->{'time'};
$eventid = $event->attributes()->{'eventid'};
$betname = $event->bettype->bet->attributes()->{'name'};
$bettypeid = $event->bettype->attributes()->{'bettypeid'};
$betprice = $event->bettype->bet->attributes()->{'price'};
$betid = $event->bettype->bet->attributes()->{'id'};
$new_array[$betname.$betid] = array(
'betname' => $betname,
'viewbets' => $viewbets,
'betid' => $betid,
'betname' => $betname,
'betprice' => $betprice,
'betpriceid' => $event->bettype->attributes()->{'betid'},
);
}
ksort($new_array);
$limit = 10;
$c = 0;
foreach ($new_array as $event_time => $event_data) {
// $racedate = $event_data['eventy'].$event_data['eventm'].$event_data['eventd'];
$today = date('Ymd');
//if($today == $racedate){
// if ($limit == $c) {
// break;
//}
//$c++;
$replace = array("/"," ");
// $eventname = str_replace($replace,'-', $event_data['eventname']);
//$venue = str_replace($replace,'-', $event_data['venue']);
echo "<div class=\"units-row unit-100\">
<div class=\"unit-20\" style=\"margin-left:0px;\">
".$event_data['betprice']."
</div>
<div class=\"unit-50\">
".$event_data['betname'].' - '.$event_data['betprice']."
</div>
<div class=\"unit-20\">
<img src=\"betnow.gif\" ><br />
</div>
</div>";
}
}//echo "<strong>View ALL Horse Races</strong> <strong>>></strong>";
//var_dump($event_data);
}
?>
Now basically the XML file contains a list of horse races that are happening today.
The page I call the function on also declares
<?php $viewbets = $_GET['EVID'];?>
Then where the function is called I have
<?php HRList5($viewbets);?>
I've just had a play around and now it displays the data in the first <bet> node,
but the issue is that it's not displaying them ALL; it's just repeating the first one down the page.
I basically need the xml feed queried & if the event->bettype->attributes()->{'bettypeid'} == $viewbets I want the bet nodes repeated down the page.
I don't use SimpleXML, so I can offer no guidance with that. I would say, however, that an XPath query is the way to find the elements and attributes you need within the XML feed. The following code will hopefully be of use in that respect; it probably has an easy translation into SimpleXML methods.
Edit: Rather than targeting each bet as the original xpath did which then caused issues, the following should be more useful. It targets the bettype and then processes the childnodes.
/* The `eid` to search for in the DOM document */
$eid=25573360.20;
/* create the DOM object & load the xml */
$dom=new DOMDocument;
$dom->load( 'http://xml.betfred.com/Horse-Racing-Daily.xml' );
/* Create a new XPath object */
$xp=new DOMXPath( $dom );
/* Search the DOM for nodes with particular attribute - bettypeid - use the XPath number() function to test */
$oCol=$xp->query('//event/bettype[ number( @bettypeid )="'.$eid.'" ]');
/* If the query was successful there should be a nodelist object to work with */
if( $oCol ){
foreach( $oCol as $node ) {
echo '
<h1>'.$node->parentNode->getAttribute('name').'</h1>
<h2>'.date('D, j F, Y',strtotime($node->getAttribute('bet-start-date'))).'</h2>';
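foreach( $node->childNodes as $bet ){
/* childNodes may also contain whitespace text nodes, which have no getAttribute() */
if( !( $bet instanceof DOMElement ) ) continue;
echo "<div>Name: {$bet->getAttribute('name')} ID: {$bet->getAttribute('id')} Price: {$bet->getAttribute('price')}</div>";
}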
}
} else {
echo 'XPath query failed';
}
$dom = $xp = $oCol = null;
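For comparison, here is a rough SimpleXML sketch of the same idea: match the bettype, then loop over every <bet> inside it rather than reading only the first. The sample feed is invented; the element and attribute names follow the question's code, while the <category> root and all values are assumptions.

```php
<?php
// Invented sample feed; only the element/attribute names come from the question.
$xml = simplexml_load_string(
    '<category>
        <event name="Race A" eventid="1">
            <bettype bettypeid="100">
                <bet name="Horse One" id="11" price="5/1"/>
                <bet name="Horse Two" id="12" price="7/2"/>
            </bettype>
        </event>
        <event name="Race B" eventid="2">
            <bettype bettypeid="200">
                <bet name="Horse Three" id="21" price="2/1"/>
            </bettype>
        </event>
    </category>'
);

$viewbets = '100'; // would come from $_GET['EVID'] on the real page
$bets = [];

foreach ($xml->event as $event) {
    foreach ($event->bettype as $bettype) {
        if ((string) $bettype['bettypeid'] !== $viewbets) {
            continue;
        }
        // Iterating $bettype->bet visits every <bet> child, not only the first.
        foreach ($bettype->bet as $bet) {
            $bets[] = [
                'name'  => (string) $bet['name'],
                'id'    => (string) $bet['id'],
                'price' => (string) $bet['price'],
            ];
        }
    }
}

foreach ($bets as $bet) {
    echo "{$bet['name']} - {$bet['price']}\n";
}
```

Accessing $event->bettype->bet->attributes() without a loop, as in the question, only ever reads the first <bet>, which would explain the repeated row.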
I have an XML schema that looks as follows:
<xml>
<user id="1">
<first_name>Bill</first_name>
<last_name>Steve</last_name>
<phone_numbers>
<work>xxx-xxx-xxxx</work>
<home>xxx-xxx-xxxx</home>
</phone_numbers>
</user>
<user id="2">
........
</user>
</xml>
I'm working on parsing all of this information in PHP using DOM. For example:
$userInfo = $doc->getElementsByTagName( "user" );
foreach($userInfo as $row)
{
$first_name = $row->getElementsByTagName("first_name");
}
When I try to nest this to select the phone numbers however I get an error. I've tried using XPath to select the phone numbers with equal problems. I tried something along the lines of
$userInfo = $doc->getElementsByTagName( "user" );
foreach($userInfo as $row)
{
$phoneInfo = $row->getElementsByTagName("phone_numbers");
foreach($phoneInfo as $row2)
{
$work = $row2->getElementsByTagName("work");
}
}
I'm curious whether I'm doing something fundamentally wrong, or how to get this going. I've been tearing my hair out for a few hours now.
You can't get the value directly from a DOMNodeList object. Try this:
$userInfo = $doc->getElementsByTagName( "user" );
foreach($userInfo as $row)
{
$phoneInfo = $row->getElementsByTagName("phone_numbers");
foreach($phoneInfo as $row2)
{
// get the value from the first child
$work = $row2->getElementsByTagName("work")->item(0)->nodeValue;
$home = $row2->getElementsByTagName("home")->item(0)->nodeValue;
}
}
Well, you could switch it to SimpleXml which makes this type of parsing easier:
$userInfo = $doc->getElementsByTagName( "user" );
foreach ($userInfo as $user) {
$node = simplexml_import_dom($user);
$id = (string) $node['id'];
$first = (string) $node->first_name;
$last = (string) $node->last_name;
$workPhone = (string) $node->phone_numbers->work;
$homePhone = (string) $node->phone_numbers->home;
}
Now, in DomDocument, you could do this by using DomXpath:
$userInfo = $doc->getElementsByTagName( "user" );
$xpath = new DomXpath($doc);
foreach ($userInfo as $user) {
$id = $user->getAttribute('id');
// Use paths relative to $user; a leading // would search the whole document
$first = $xpath->query('first_name', $user)->item(0)->textContent;
$last = $xpath->query('last_name', $user)->item(0)->textContent;
$work = $xpath->query('phone_numbers/work', $user)->item(0)->textContent;
$home = $xpath->query('phone_numbers/home', $user)->item(0)->textContent;
}
Note that both snippets above require every element to be present. If some elements are optional, you might want to change the code to something like this (shown for first_name only):
$userInfo = $doc->getElementsByTagName( "user" );
$xpath = new DomXpath($doc);
foreach ($userInfo as $user) {
$id = $user->getAttribute('id');
$firstQuery = $xpath->query('first_name', $user);
if ($firstQuery->length > 0) {
$first = $firstQuery->item(0)->textContent;
} else {
$first = '';
}
}
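The length check can be wrapped in a small helper so that every optional field is handled the same way. This is a sketch only: firstText() is a made-up name, the sample document is invented, and the expressions are relative to the context node passed in.

```php
<?php
// Returns the text of the first node matching $expression relative to $context,
// or $default when nothing matches. The helper name is hypothetical.
function firstText(DOMXPath $xpath, string $expression, DOMNode $context, string $default = ''): string
{
    $nodes = $xpath->query($expression, $context);
    if ($nodes === false || $nodes->length === 0) {
        return $default;
    }
    return $nodes->item(0)->textContent;
}

// Invented sample document; the second user has no <home> number.
$doc = new DOMDocument();
$doc->loadXML(
    '<users>
        <user id="1"><first_name>Bill</first_name>
            <phone_numbers><work>111</work><home>222</home></phone_numbers></user>
        <user id="2"><first_name>Ann</first_name>
            <phone_numbers><work>333</work></phone_numbers></user>
    </users>'
);
$xpath = new DOMXPath($doc);

$users = [];
foreach ($doc->getElementsByTagName('user') as $user) {
    $users[$user->getAttribute('id')] = [
        'first' => firstText($xpath, 'first_name', $user),
        'work'  => firstText($xpath, 'phone_numbers/work', $user),
        'home'  => firstText($xpath, 'phone_numbers/home', $user, 'n/a'),
    ];
}

print_r($users);
```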
I have the following code at the moment:
$ip = '195.72.186.157';
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML(file_get_contents('http://www.geoffmeierhans.com/services/geo-locator/locate/?ip='.$ip.'&output=xml'));
foreach($xmlDoc->getElementsByTagName('city') as $link) {
$links = array('text' => $link->nodeValue);
}
$city = $links['text'];
echo $city;
Is there a better way to get the city variable? Since there is only one tag called city, a loop isn't really needed, but I can't get it to work any other way.
Well, you can use the length property of the DOMNodeList (which is what the getElementsByTagName call returns).
If you want only the first result:
$nodes = $xmlDoc->getElementsByTagName('city');
if ($nodes->length > 0) {
$city = $nodes->item(0)->nodeValue;
} else {
$city = ''; // There is no city element
}
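Another option, sketched here under the assumption that the response contains a single <city> element somewhere in the document, is DOMXPath::evaluate() with the XPath string() function, which returns an empty string when nothing matches. The sample XML is invented and only stands in for the real service response.

```php
<?php
// Invented sample response; the real service's structure is an assumption.
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML('<result><country>Switzerland</country><city>Zurich</city></result>');

$xpath = new DOMXPath($xmlDoc);

// string() yields the text of the first matching node, or '' if there is none.
$city = $xpath->evaluate('string(//city)');
$missing = $xpath->evaluate('string(//region)');

echo $city;
```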