How to effectively build sitemap using PHP? - php

I have built a simple sitemap script, but I am not able to get the URL to appear in the <loc> field.
My PHP script:
header("Content-Type: text/xml;charset=iso-8859-1");
echo '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
';
require_once('_ls-global/php/sr-connect.php');
$db = mysql_select_db($database,$connection) or trigger_error("SQL", E_USER_ERROR);
$sqlquery = mysql_query("SELECT * FROM $tablename ORDER by id")or die (mysql_error());
while ($list = mysql_fetch_assoc($sqlquery)){
$pflink=$list['pflink'];
$pagelink=$list['pagelink'];
$site="http://mysite.com";
$url='$site/$pflink/$pagelink';
$changefreq="weekly";
$priority="1.0";
echo '<url>
<loc>'.$url.'</loc>
<changefreq>'.$changefreq.'</changefreq>
<priority>'.$priority.'</priority>
</url>';
}
echo '</urlset>';
The output of this script is:
<url>
<loc>$site/$pflink/$pagelink</loc>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
If I change $url='$site/$pflink/$pagelink'; to $url="$site/$pflink/$pagelink";
then I get only one value, followed by the error "XML Parsing Error: not well-formed".
Please take a look and suggest any modification that would make it work.
Thanks

I guess some of the variables contain characters that break the XML,
for example &, ä, < or >. You need to encode the content correctly.
Try wrapping the output in CDATA:
First change $url to $url = $site .'/'. $pflink .'/'. $pagelink; and then update the XML output to:
<?php
// ...
echo '<url>
<loc><![CDATA['.$url.']]></loc>
<changefreq>'.$changefreq.'</changefreq>
<priority>'.$priority.'</priority>
</url>';
?>
An explanation of CDATA is available at http://en.wikipedia.org/wiki/CDATA
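As an alternative to CDATA, the sitemap protocol asks for entity-escaped values, which htmlspecialchars() can produce. The following is only a minimal sketch: sitemapLoc() is a hypothetical helper name, and the sample path values are made up for illustration.

```php
<?php
// Hypothetical helper: build one <loc> element with XML-escaped content.
function sitemapLoc(string $site, string $pflink, string $pagelink): string
{
    // Plain concatenation, as suggested above.
    $url = $site . '/' . $pflink . '/' . $pagelink;

    // ENT_XML1 makes htmlspecialchars() emit XML entities (&amp;, &lt;, &gt;, ...).
    return '<loc>' . htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8') . '</loc>';
}

// Made-up sample values; in the question they come from the database row.
echo sitemapLoc('http://mysite.com', 'products', 'page?a=1&b=2'), "\n";
```

With escaping in place, an ampersand in a stored link no longer makes the document ill-formed.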

Based on the answers and comments from thedom and FrontEndJohn, I got it working this way:
I changed $url='$site/$pflink/$pagelink'; to $url = $site .'/'. $pflink .'/'. $pagelink;
and modified
echo '<url>
<loc>'.$url.'</loc>
<changefreq>'.$changefreq.'</changefreq>
<priority>'.$priority.'</priority>
</url>';
to
echo '<url>';
echo '<loc><![CDATA['.$url.']]></loc>';
echo '<changefreq>'.$changefreq.'</changefreq>';
echo '<priority>'.$priority.'</priority>';
echo '</url>';
Hope this helps others too.

If I understand your problem correctly, single quotes (') stop the variables from being expanded, but when you switch to double quotes (") so that they are expanded, the XML breaks.
Try:
$url = $site . '/' . $pflink . '/' . $pagelink;
This gives you the values of the variables without using double quotes. If I have misunderstood you, please let me know.
Edit: Thinking about it, it looks more like the value of one or more of the variables is what is upsetting the XML, since the single-quoted version never inserts those values into the output. It is worth checking the contents of the variables for problem characters if you have not done so already.
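The difference between the quoting styles can be seen in a short sketch. The values assigned to $pflink and $pagelink here are made up; only the variable names come from the question.

```php
<?php
$site = 'http://mysite.com';
$pflink = 'products';   // made-up sample value
$pagelink = 'widgets';  // made-up sample value

// Single quotes: the text is taken literally, nothing is expanded.
$literal = '$site/$pflink/$pagelink';

// Double quotes: variables are expanded inside the string.
$interpolated = "$site/$pflink/$pagelink";

// Explicit concatenation gives the same result as interpolation.
$concatenated = $site . '/' . $pflink . '/' . $pagelink;

echo $literal, "\n", $interpolated, "\n", $concatenated, "\n";
```

Because the second and third forms produce identical strings, switching between them cannot by itself fix a "not well-formed" error; that points at the data.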

Related

Open XML document in echo

Hello again, I need your help once more. I have an XML document (http://inlivefm.6te.net/AirPlayHistory.xml) that provides the names of the songs played.
I am trying to extract that information from the XML and echo it with PHP. I wrote the PHP code below, but it must be wrong because it outputs nothing, so I am asking for your help in solving this problem.
<?php
$xml = simplexml_load_file("http://inlivefm.6te.net/AirPlayHistory.xml");
print $xml->Event->Song['title'];
echo '';
?>
<?php
$xml = simplexml_load_file("http://inlivefm.6te.net/AirPlayHistory.xml");
print $xml->Event->Song->Artist['name'];
echo '';
?>
Can someone help me?
Thank you in advance.
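SimpleXML does not expose the root element as a child: the object returned by simplexml_load_file() already is the root element, so address its children directly. Write it like this: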
$xml = simplexml_load_file("http://inlivefm.6te.net/AirPlayHistory.xml");
foreach($xml->Song as $item)
echo $item->Artist['name'] . " - " . $item['title'] ."<br>";
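To try this without the remote file, here is a small sketch that uses an inline document. The sample XML is made up to match the element and attribute names used above; the real AirPlayHistory.xml may be structured differently.

```php
<?php
// Made-up sample document; the real feed may differ.
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<Event>
  <Song title="First Song"><Artist name="Artist A"/></Song>
  <Song title="Second Song"><Artist name="Artist B"/></Song>
</Event>
XML;

$xml = simplexml_load_string($source);

// $xml already represents <Event>, so <Song> is reached directly.
$lines = [];
foreach ($xml->Song as $song) {
    $lines[] = $song->Artist['name'] . ' - ' . $song['title'];
}

echo implode("\n", $lines), "\n";
```

Calling $xml->getName() is a quick way to confirm which element the loaded object stands for.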

XML with PHP “echo” getting error “Extra content at the end of the document”

I have asked a question here on how to Generate a sitemap automatically, does it need to be XML?
Here is the solution we have concluded:
<?php
header ("Content-Type:text/xml");
echo '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
// code to extract and echo links from the file
echo ' </urlset>';
?>
<?PHP
// Original PHP code by Chirp Internet: www.chirp.com.au
// Please acknowledge use of this code by including this header.
$url = "assets/includes/menu.inc";
$input = @file_get_contents($url) or die("Could not access file: $url");
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
foreach($matches as $match) {
// $match[2] = link address
// $match[3] = link text
echo '<url><loc>' . $match[2] . '</loc></url>';
}
}
?>
However, when I tried it, I got the error shown here: http://postimg.org/image/gh5d0k4sx/. When I removed the top line, header ("Content-Type:text/xml");, it worked, but am I allowed to remove that line? The whole thing is for the sake of SEO, so I don't know whether the line can be deleted. What am I doing wrong?
Another question: is this file now recognized as an XML file, even though it has a .php extension?
Your PHP doesn't get picked up by the browser since it's a server side language.
The header function doesn't modify the body of the page. It's important to keep it in though, or the browser will not recognize the document as XML.
Try removing the closing and opening PHP tags between the two parts of the script. The whitespace in between them may be causing your error.
?>
<?PHP
Edit: following the comments, wait until you have output your <url> tags before closing urlset.
Move this line to the bottom of the PHP:
echo ' </urlset>';
It's also in the best interests of clean XML to understand how to use line breaks and double quotes to achieve a similar effect.
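The ordering described in this answer (declaration, opening urlset, every url entry, and only then the closing tag) can be sketched as a single function. buildSitemap() is a hypothetical name, and the menu HTML passed to it below is made-up sample input standing in for assets/includes/menu.inc.

```php
<?php
// Hypothetical helper: turn the HTML of a menu into a sitemap string.
function buildSitemap(string $menuHtml): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

    // Same pattern as in the question; $match[2] is the href value.
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    if (preg_match_all("/$regexp/siU", $menuHtml, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // hrefs copied from HTML may already be entity-escaped;
            // the last argument (false) avoids escaping them twice.
            $loc = htmlspecialchars($match[2], ENT_QUOTES | ENT_XML1, 'UTF-8', false);
            $xml .= '<url><loc>' . $loc . "</loc></url>\n";
        }
    }

    // The root element is closed only after every <url> has been written.
    return $xml . "</urlset>\n";
}

// Made-up sample menu.
$menu = '<ul><li><a href="http://example.com/a.php">A</a></li>'
      . '<li><a href="http://example.com/b.php">B</a></li></ul>';

header('Content-Type: text/xml');
echo buildSitemap($menu);
```

Building the whole string before echoing also keeps stray whitespace between PHP blocks out of the output.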
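Crawlers may not find the file on their own.
The sitemap protocol at http://www.sitemaps.org/protocol.html does not mandate a particular filename, but sitemap.xml in the site root is the conventional location.
Any other URL has to be submitted to the search engines or listed in robots.txt.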
I would suggest creating the file "sitemap.xml" with
file_put_contents("sitemap.xml", $xmlContent);
A static file is faster, and you can schedule its re-creation with a cron job.
How do you create a cron job on Linux?
In a shell, run crontab -e.
Your editor opens.
Add a new cron job line, such as: 00 22 * * * php /path/to/sitemapBuilder.php
This means: run your sitemap generation script every day at 22:00.
Content of sitemapBuilder.php:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
$url = "assets/includes/menu.inc";
$input = @file_get_contents($url) or die("Could not access file: $url");
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
foreach($matches as $match) {
// $match[2] = link address
// $match[3] = link text
$xml .= '<url><loc>' . $match[2] . '</loc></url>';
}
}
$xml .= '</urlset>';
file_put_contents('sitemap.xml', $xml);
?>
Write sitemap.xml to the root folder of your web project, e.g. next to index.php.
You might also use a Sitemap Validator to point to your URL and check the file for validity.
For example, http://www.validome.org/google/ might help with that.

Can parse one document, but not another

I have two XML documents, both formatted like this:
<?xml version="1.0" ?>
<article>
<body>
<![CDATA[
*some text*
]]>
</body>
</article>
and I want to echo them using this:
<?php
$xml = simplexml_load_file("." . $filename);
echo $xml->body;
?>
But one of them works, the other just echos nothing. What is going on?
UPDATE:
The document which produces the error contains this apostrophe: '
When this apostrophe is removed, the code works. I need some way of escaping characters like this. How can I do it?
Echo the result of asXML(); you may then see what is wrong with the second file.
echo $xml->asXML();
Here is a simple tutorial on SimpleXML: http://php.net/manual/en/simplexml.examples-basic.php
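If loading fails outright, the parser's own error messages usually say why. A minimal sketch using libxml's error collection follows; the broken sample string is made up and is not the document from the question.

```php
<?php
// Collect parser errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Deliberately broken sample: a bare & is not allowed in XML text.
$broken = '<article><body>Tom & Jerry</body></article>';

$xml = simplexml_load_string($broken);

$messages = [];
if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
}

echo implode("\n", $messages), "\n";
```

The same pattern works with simplexml_load_file(), so it can be pointed at the file that refuses to load.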
Escape your apostrophe:
<?php
$text = file_get_contents("." . $filename);
$text = str_replace("'", "&apos;", $text);
$xml = simplexml_load_string($text);
echo $xml->body;
?>
Also, someone had a similar problem (no crash but garbage characters) and came up with the same solution. A bit later in that forum thread they speculate on utf8_encode and utf8_decode, which you could also try. Link: http://board.phpbuilder.com/showthread.php?10359181-RESOLVED-SimpleXML-apostrophe-problem&p=10886946&viewfull=1#post10886946

Parsing xml file with .php extension and php header

I have a slight problem. I need to parse a file and populate a web banner with the results. The problem is that the file is called "_banner_products.php" and its contents are as follows:
<?php header('Content-Type:text/xml'); ?><?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
<carouselle>
<firstLayer>
<layerName>Leica Disto X310</layerName>
<layerProduct>Disto X310</layerProduct>
<layerPic>http://www.leicaestonia.ee/extensions/boxes_design/flashpics/1334482548.jpg</layerPic>
<layerPrice>0,-</layerPrice>
<layerPriceOld></layerPriceOld>
<layerLink>http://www.leicaestonia.ee/index.php?id=11627</layerLink>
<layerTimer>01.05.2012 00:00</layerTimer>
</firstLayer>
<firstLayer>
.....
.....
</firstLayer>
</carouselle>
How can I loop through this file to group all the "firstLayer" children into one and so on..
If I just use:
$file = fopen("_banner_products.php", "r");
while (!feof($file)){
$line = fgets($file);
}
then I only get the contents of the <...> tags, meaning there is no way for me to tell whether I am already out of an element's scope.
simplexml_load_file throws this:
"_banner_products.php:1: parser error : Start tag expected, '<' "
Thanks to anyone responding. If anything is unclear, I'll try to explain more.
EDIT.
Thank you for the solution, indeed using the full URL worked:
simplexml_load_file("http://localhost/MySite/_banner_products.php");
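You are having this issue because simplexml_load_file is treating your file as a local XML file: read from disk, the PHP code is not executed, so the parser sees the raw <?php ... ?> line instead of XML. What you need to do is pass the full URL, so that the web server runs the script first.
Example:
simplexml_load_file("http://localhost/web/_banner_products.php");
Use case: getting layerName, for example.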
_banner_products.php
<?php
header ( 'Content-Type:text/xml' );
echo '<?xml version="1.0" encoding="UTF-8"?>';
?>
<carouselle>
<firstLayer>
<layerName>Leica Disto X310</layerName>
<layerProduct>Disto X310</layerProduct>
<layerPic>http://www.leicaestonia.ee/extensions/boxes_design/flashpics/1334482548.jpg</layerPic>
<layerPrice>0,-</layerPrice>
<layerPriceOld></layerPriceOld>
<layerLink>http://www.leicaestonia.ee/index.php?id=11627</layerLink>
<layerTimer>01.05.2012 00:00</layerTimer>
</firstLayer>
<firstLayer>
<layerName>Leica Disto X310</layerName>
<layerProduct>Disto X310</layerProduct>
<layerPic>http://www.leicaestonia.ee/extensions/boxes_design/flashpics/1334482548.jpg</layerPic>
<layerPrice>0,-</layerPrice>
<layerPriceOld></layerPriceOld>
<layerLink>http://www.leicaestonia.ee/index.php?id=11627</layerLink>
<layerTimer>01.05.2012 00:00</layerTimer>
</firstLayer>
</carouselle>
Reading the details:
$xml = simplexml_load_file("http://localhost/lab/stockoverflow/_banner_products.php");
echo "<pre>" ;
foreach($xml as $key => $element)
{
echo $element->layerName , PHP_EOL ;
}
The most obvious way to do this is to strip out the first line, and add the XML declaration back in with your code.
You could also parse the file with PHP, using eval(), but be very sure about what you are parsing, as this could be a very large security hole.
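The first suggestion, dropping the first line and adding the declaration back, might look like the sketch below. It assumes all of the PHP sits on line 1, as in the question's file; the inline string stands in for file_get_contents('_banner_products.php'), and the second product name is made up.

```php
<?php
// Stand-in for file_get_contents('_banner_products.php'); data is abbreviated.
$raw = <<<'SRC'
<?php header('Content-Type:text/xml'); ?><?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
<carouselle>
<firstLayer><layerName>Leica Disto X310</layerName></firstLayer>
<firstLayer><layerName>Leica Disto D2</layerName></firstLayer>
</carouselle>
SRC;

// Drop line 1 (the PHP header/echo line) and add the XML declaration back.
$body = substr($raw, strpos($raw, "\n") + 1);
$xml = simplexml_load_string('<?xml version="1.0" encoding="UTF-8"?>' . "\n" . $body);

// Each <firstLayer> is a direct child of the root element.
$names = [];
foreach ($xml->firstLayer as $layer) {
    $names[] = (string) $layer->layerName;
}

echo implode(', ', $names), "\n";
```

Unlike eval(), this never executes the file's contents, at the cost of breaking if PHP code later appears anywhere other than the first line.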

Creating an XML sitemap with PHP

I'm trying to create a sitemap that will automatically update. I've done something similar with my RSS feed, but this sitemap refuses to work. You can view it live at http://designdeluge.com/sitemap.xml. I think the main problem is that it's not recognizing the PHP code. Here's the full source:
<?php
include 'includes/connection.php';
header("Content-type: text/xml");
echo '<?xml version="1.0" encoding="UTF-8" ?>';
?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
<url>
<loc>http://designdeluge.com/</loc>
<lastmod>2010-04-20</lastmod>
<changefreq>weekly</changefreq>
<priority>1.00</priority>
</url>
<url>
<loc>http://designdeluge.com/about.php</loc>
<lastmod>2010-04-20</lastmod>
<changefreq>never</changefreq>
<priority>0.5</priority>
</url>
<?php
$entries = mysql_query("SELECT * FROM Entries");
while($row = mysql_fetch_assoc($entries)) {
$title = stripslashes($row['title']);
$date = date("Y-m-d", strtotime($row['timestamp']));
echo "
<url>
<loc>http://designdeluge.com/".$title."</loc>
<lastmod>".$date."</lastmod>
<changefreq>never</changefreq>
<priority>0.8</priority>
</url>";
} ?>
</urlset>
The problem is that the dynamic URLs (i.e. the ones pulled from the DB) aren't being generated and the sitemap won't validate. Thanks!
EDIT: Right now, I'm just trying to get the code itself working. I have it set up as a PHP file on my local testing server. The code above is being used. Right now, nothing displays on screen or in the source. I'm thinking I made a syntax error, but I can't find anything. Any and all help is appreciated!
EDIT 2: OK, I got it sorted out, guys. Apparently, I had to echo the XML declaration with PHP. The final code is posted above. Thanks for your help!
If you take a look at the sitemap.xml that's generated (using view source, in your browser, for example), you'll see this :
<?php header('Content-type: text/xml'); ?>
<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84" xmlns:xsi="http:/
...
The <?php, present in that output, shows that PHP code is not interpreted.
This is probably because your webserver doesn't recognize .xml as an extension of files that should contain PHP code.
At least two possible solutions:
Re-configure your server so that XML files go through the PHP interpreter (might not be such a good idea: that can cause problems with existing files!)
Change the extension of your sitemap, to sitemap.php for example, so it's interpreted by your server.
I would add another solution:
Have a sitemap.php file that contains the code
And use a RewriteRule so the sitemap.xml URL actually points to the sitemap.php file
With that, you'll have the sitemap.xml URL, which is nice (required?), but as the code will be in sitemap.php, it'll get interpreted.
See Apache's mod_rewrite.
The simplest solution is to add the following line to your Apache .htaccess file, after RewriteEngine On:
RewriteRule ^sitemap\.xml$ sitemap.php [L]
and then simply to have a file sitemap.php in your root folder, which will then be accessible via http://yoursite.com/sitemap.xml, the default URL where search engines will look first.
The file sitemap.php should start with
<?php header('Content-type: application/xml; charset=utf-8') ?>
<?php echo '<?xml version="1.0" encoding="UTF-8"?>' ?>
I've used William's code (thanks) and with a few small modifications it worked for me.
I think the line:
header("Content-type: text/xml");
should be the second line after the top <?php
Incidentally, just a small point for anyone else who copies it: there is a single space character before the <?php in the first line. If you inadvertently copy it as I did, you will spend a bit of time trying to figure out why the code won't work for you!
I had to tweak the MySQL select statement a little bit too.
Finally, in the output, I have used a variable $domain so that this piece of code can be used as a template without the need to think about it (provided you use the same table name each time). The variable is added to the connectdb.php file, which is included to connect to the database.
Here is my working version of William's code:
<?php
header("Content-type: text/xml");
echo '<?xml version="1.0" encoding="UTF-8" ?>';
include 'includes/connectdb.php';
?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
<url>
<loc>http://www.DOMAIN.co.uk/</loc>
<priority>1.00</priority>
</url>
<?php
$sql = "SELECT * FROM pages WHERE onshow = 1 ORDER BY id ASC";
$result = mysql_query($sql,$conn);
while($row = mysql_fetch_array($result))
{
$filename = stripslashes($row['filename']);
?>
<url>
<loc>http://www.<?php echo "$domain"; ?>/<?php echo "$filename" ?></loc>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
<?php } ?>
</urlset>
Here is the easiest way to create and update a sitemap.xml file.
// Base URL: scheme and host only. Appending REQUEST_URI would put this script's own path into every <loc>.
$actual_link = (isset($_SERVER['HTTPS']) && $_SERVER['HTTPS'] === 'on' ? "https" : "http") . "://" . $_SERVER['HTTP_HOST'];
require_once('database.php');
$sitemapText = '<?xml version="1.0" encoding="UTF-8"?>
<urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>http://ajkhub.com/</loc>
<lastmod>2021-08-18T18:32:09+00:00</lastmod>
<priority>1.00</priority>
</url>
<url>
<loc>http://ajkhub.com/includes/about.php</loc>
<lastmod>2021-08-18T18:32:09+00:00</lastmod>
<priority>0.80</priority>
</url>
<url>
<loc>http://ajkhub.com/includes/privacy-policy.php</loc>
<lastmod>2021-08-18T18:32:09+00:00</lastmod>
<priority>0.80</priority>
</url>
<url>
<loc>http://ajkhub.com/includes/termsandcondition.php</loc>
<lastmod>2021-08-18T18:32:09+00:00</lastmod>
<priority>0.80</priority>
</url>';
$sql = "SELECT * FROM page ORDER BY id DESC LIMIT 4";
$result = mysqli_query($conn, $sql);
if (mysqli_num_rows($result) > 0) {
while($row = mysqli_fetch_assoc($result)) {
$sitemapText .= ' <url>
<loc>'.$actual_link."/".$row['page'].'</loc>
<lastmod>'.date(DATE_ATOM,time()).'</lastmod>
<priority>0.80</priority>
</url>';
}
}
$sitemapText .= '</urlset>';
$sitemap = fopen("sitemap.xml", "w") or die("Unable to open file!");
fwrite($sitemap, $sitemapText);
fclose($sitemap);
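String concatenation leaves escaping to the author. One alternative is PHP's XMLWriter extension, which escapes text content as it writes. The sketch below is not part of the answer above: renderSitemap() is a hypothetical helper, and the $pages array with its sample row is a made-up stand-in for the database result.

```php
<?php
// Hypothetical helper: $pages is a list of ['loc' => ..., 'lastmod' => ...] rows.
function renderSitemap(array $pages): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->setIndent(true);
    $writer->startDocument('1.0', 'UTF-8');

    $writer->startElement('urlset');
    $writer->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($pages as $page) {
        $writer->startElement('url');
        $writer->writeElement('loc', $page['loc']);         // text is escaped automatically
        $writer->writeElement('lastmod', $page['lastmod']);
        $writer->endElement(); // </url>
    }

    $writer->endElement(); // </urlset>
    $writer->endDocument();

    return $writer->outputMemory();
}

// Made-up sample row.
$xml = renderSitemap([
    ['loc' => 'http://example.com/page?a=1&b=2', 'lastmod' => '2021-08-18T18:32:09+00:00'],
]);

echo $xml;
// file_put_contents('sitemap.xml', $xml); // write it out once the output looks right
```

The returned string can be written with file_put_contents() or with the fopen()/fwrite() calls used above.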
