I would like to know from others' experience the best way to create sitemaps with CodeIgniter. I have looked at some plugins/libraries, but they all check the database for the pages. What happens if some pages on the site are static rather than dynamic?
Is there any way to crawl the site using PHP and create an XML file with the results?
A tool I have used previously for my projects is http://enarion.net/tools/phpsitemapng/download/
It is a free tool for creating sitemaps and supports features such as cron jobs.
What is my next step? How can I achieve this?
Well, your problem lies in the fact that you have both dynamic and static pages. So, a crawler would work, but you'd have to generate a list of links to all dynamic pages. Then, your crawler could hit that list and have access to all dynamic pages, and then hit directories where you have your static pages.
However, the docs for the phpsitemapng tool you mention state that it will crawl a live website. So, if you have links to all of your pages accessible from those pages, then it will do what you need.
Scans files on website (slower, but will also find dynamically generated files and links)
Related
how can I develop a multipage website?
should I develop all other linked pages in the same way I created the homepage?
I am intending to develop a PHP website for a store and I want to link each product to its individual page. But since there are lots of products, it's quite tedious to create an individual page for each product.
So instead of creating many different pages for each product, can I create one dynamically changing page and use it for all the product links? I mean, can I create a single page whose basic layout always remains the same but where only some of the content changes according to the selected product?
Yes, you can do that; it's very common to create a page that changes dynamically.
For example, you can create a php file that takes care of the header portion of the site, then simply call it within your index.php page as so:
<?php show_header(); ?>
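To make the idea concrete, here is a minimal sketch of what such a helper might look like. The name show_header comes from the answer above; the markup and the default title are illustrative assumptions:

```php
<?php
// functions.php - a hypothetical helper that builds the shared header.
// Every page can reuse it, so the layout lives in one place.
function show_header(string $title = 'My Site'): string
{
    return "<!DOCTYPE html>\n<html>\n<head><title>"
         . htmlspecialchars($title)   // escape so titles can't break the markup
         . "</title></head>\n<body>\n";
}

// index.php would then do:
// require 'functions.php';
// echo show_header('Home');
```

Returning the markup as a string (instead of echoing directly) makes the helper easier to test and reuse.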
Take a look at this website to get started!
http://www.techiwarehouse.com/engine/d3bf2317/Building-Dynamic-Web-Pages-with-PHP
The site I provided is great for starting to play with PHP; it provides good information with easy-to-follow steps.
Another great resource is the IBM PHP PDF:
http://www.ibm.com/developerworks/linux/tutorials/l-php/l-php-pdf.pdf
Finally my personal favorite, "Practical PHP & MySQL" PDF:
http://utopia.duth.gr/~stavtran/vivlia/PHP_and_MySQL.pdf
It's also worth noting that there are content management systems (CMSs) out there, such as Joomla and WordPress, which are very easy to use and powerful. Many of these CMSs allow for plugins that will make your life much easier, especially if your goal is simply a solid working e-commerce website that sells products and you don't really wish to do hardcore PHP, MySQL, JavaScript, CSS, HTML, and jQuery coding ;-)
Content Management Systems
http://wordpress.org/
http://www.joomla.org/
You can implement a single page, let's call it product.php. Then you can use the GET method to dynamically call the page. It would look like product.php?id=1, where id=1 is the primary key of the product in a MySQL table. And you can fetch various products by just changing the id in the URL.
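A minimal sketch of that product.php, assuming a hypothetical "products" table with id, name, and description columns (the DSN and credentials are placeholders). Casting the id to an integer and using a prepared statement keeps the query-string input safe:

```php
<?php
// product.php - a single template reused for every product.

// Render the parts of the page that change per product.
function render_product(array $row): string
{
    return '<h1>' . htmlspecialchars($row['name']) . '</h1>'
         . '<p>'  . htmlspecialchars($row['description']) . '</p>';
}

// Validate the id from the query string (product.php?id=1).
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;
if ($id > 0) {
    // Placeholder connection details; a prepared statement avoids SQL injection.
    $pdo  = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
    $stmt = $pdo->prepare('SELECT name, description FROM products WHERE id = ?');
    $stmt->execute([$id]);
    if ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        echo render_product($row);
    }
}
```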
You can create a multipage PHP website using the following method:
1) Create a folder and name it “my_first_php_website”.
2) Create the following folders in your “my_first_php_website” folder:
“images”, “includes”
3) Create the following folders in the “includes” folder:
“css”, “js”
4) Create the following files in the “my_first_php_website” folder:
-> “index.php”, “header.php”, “footer.php”, “sidebar.php”, “nav.php”, “404.php”, “about.php”, “functions.php”
Create the following file in the “css” folder:
-> “style.css”
Put your website logo in the “images” folder:
-> “my_website_logo.png”
5) Now follow a step-by-step tutorial to create your first website using PHP.
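The files from step 4 can be glued together in index.php along these lines. This is only a sketch: each include is assumed to simply print its part of the markup, and the helper function is an illustrative name, not part of any tutorial:

```php
<?php
// index.php - assemble the page from the shared include files
// created in step 4, in a fixed order.
function render_site_page(string $dir, string $content): void
{
    foreach (['header.php', 'nav.php', 'sidebar.php'] as $part) {
        if (is_file("$dir/$part")) {   // skip gracefully if a piece is missing
            include "$dir/$part";
        }
    }
    echo $content;                     // the part that differs per page
    if (is_file("$dir/footer.php")) {
        include "$dir/footer.php";
    }
}

render_site_page(__DIR__, '<main><h1>Welcome to my first PHP website</h1></main>');
```

Pages like about.php can call the same helper with different content, so the header, nav, sidebar, and footer stay in one place.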
Source: Web Designer Tricks
I am currently working on an eCommerce site for a client, but to keep things organized I want to group some pages in subfolders. However, because of the nature of eCommerce, the site relies on several pages in its base folder (search.php, products.php, etc.) to generate some of the pages of content. Therefore I cannot really have any subfolders unless I make copies of all the necessary pages in each separate subdirectory, which makes everything more complicated, not to mention creates issues of its own.
I hope my question is clear enough; I was wondering what the options are in a situation such as this.
Is there a way I can set up a redirect page for each of these common pages inside the subdirectory that redirects to the base files but maintains any POST data?
You can try storing the filtered POST values in the session (preferred), a cache, or the database, or just include a PHP script in the script to which you post the form data.
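The session approach could be sketched like this. The two functions split the work between the subfolder page and the base page; the session key "stashed_post" and the function names are illustrative assumptions:

```php
<?php
session_start();

// In /subfolder/search.php: stash the POST data, then redirect
// to the shared base script.
function stash_and_redirect(array $post): void
{
    $_SESSION['stashed_post'] = $post;
    header('Location: /search.php');
}

// In the base /search.php: recover the POST data, whether the request
// came in directly or via the redirect above.
function recover_post(): array
{
    if (!empty($_POST)) {
        return $_POST;
    }
    $post = $_SESSION['stashed_post'] ?? [];
    unset($_SESSION['stashed_post']);   // use once, then discard
    return $post;
}
```

Clearing the stash after reading it prevents stale search data from leaking into a later, unrelated request.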
How can one crawl a site for all unique links and write an XML file to the root of that respective domain? I want something like this: when I call mydomain.com/generatesitemap.php, the file crawls all the links in the domain and writes them to sitemap.xml. Is this possible in PHP with cURL?
It depends on your site. If it is a simple site, then the task is simple: grab your site's root page via cURL or file_get_contents, preg_match all the links (see here for reference: http://www.mkyong.com/regular-expressions/how-to-extract-html-links-with-regular-expression/), then recursively grab all the links that are inside your site, and do not process links that have already been processed.
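A minimal sketch of that recursive approach, under some simplifying assumptions: the regex only handles straightforward href attributes, relative links are skipped rather than resolved, and the function names are illustrative. A DOM parser would be more robust than a regex for real-world markup:

```php
<?php
// Pull the href values out of a chunk of HTML.
function extract_links(string $html): array
{
    preg_match_all('/<a\s[^>]*href=["\']([^"\']+)["\']/i', $html, $m);
    return $m[1];
}

// Recursively fetch pages, collecting every URL we have seen.
// Only absolute links on the same host are followed.
function crawl(string $url, string $host, array &$seen = []): array
{
    if (isset($seen[$url])) {
        return $seen;                    // already processed, stop here
    }
    $seen[$url] = true;
    $html = @file_get_contents($url);    // cURL would work equally well
    if ($html === false) {
        return $seen;
    }
    foreach (extract_links($html) as $link) {
        if (parse_url($link, PHP_URL_HOST) === $host) {
            crawl($link, $host, $seen);
        }
    }
    return $seen;
}

// Usage sketch: array_keys(crawl('http://mydomain.com/', 'mydomain.com'))
```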
The task become more complicated when JavaScript comes to play. If navigation uses JavaScript data, it will be difficult to obtain the links. There could be other navigation tricks, like select-combobox as dropdown menu.
The task could be even more complicated if you have pages with query strings. Say you have a catalogue section, and the URLs are like this:
/catalogue
/catalogue?section=books
/catalogue?section=papers
/catalogue?section=magazines
Is it one page or not?
And what about this one?
/feedback
/feedback?mode=sent
So you should take care of these cases.
There are many examples of such crawlers in a Google search. Look at this one, for instance:
http://phpcrawl.cuab.de/
I have a live cricket scores website, in which I control the news section dynamically.
I have my own custom-built CMS system in PHP, through which an admin adds news to the web portal.
If I generate the sitemap, none of the dynamically created pages will be added to it.
Is this good practice, or do we need to add the dynamically created links to the sitemap?
If yes, can you please share how we can add dynamic links?
One more observation I have made: whatever news is added gets cached by Google within 4 hours.
Please share your thoughts, thanks in advance
If the pages are important, then you should add them to the site map so they can be indexed for future reference. However, if the pages are going to disappear after the match, then I wouldn't put them on the site map as they may get indexed then disappear, which may have a negative impact on your search engine rankings.
You can add these dynamic pages to a site map in a couple of ways:
Whenever a new dynamic page is created, re-create your site map. Do this by looking through the database for the pages which will be valid and writing them out into an XML site map file.
When a new page is created, read the current XML site map, and insert a new entry into the relevant place.
I would say the easiest option is option 1, as you can quickly and easily build a site map without having to read what you already have. That option also means that when you remove one of the dynamic pages, it will be removed from the site map when it is re-built, without the need to read through what you have, find the entry and remove it.
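Option 1 could be sketched as follows. Only the XML-writing half is shown; the query that collects the URLs from the database (and the "pages" table it assumes) is a placeholder:

```php
<?php
// Re-create the whole sitemap from a flat list of page URLs.
function build_sitemap(array $urls): string
{
    $xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
         . "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n";
    foreach ($urls as $url) {
        // Escape so ampersands in query strings stay valid XML.
        $xml .= '  <url><loc>' . htmlspecialchars($url) . "</loc></url>\n";
    }
    return $xml . "</urlset>\n";
}

// Usage sketch, run whenever a dynamic page is created or removed:
// $urls = $pdo->query('SELECT url FROM pages')->fetchAll(PDO::FETCH_COLUMN);
// file_put_contents('sitemap.xml', build_sitemap($urls));
```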
Google code has a number of different options for you, some of which you can download and run, others look like they need implementing within your own code.
Yes, if the content of these pages needs to be referenced by search engines, of course they have to be in the sitemap.
I have worked on a lot of e-business websites, and almost 99% of the pages were dynamically generated: almost 1000 product pages versus the 3 static sales-conditions and legal pages.
So the sitemap itself was dynamic and regenerated every 15 minutes (to avoid dumping the whole product base and running thousands of queries each time the sitemap is called).
You can use a separate script to do this: I would make one template for the static pages, and another embedding the dynamically generated URLs.
It would be easier if your CMS already embeds a URL management (or routing) system.
How is it possible to generate a list of all the pages of a given website programmatically using PHP?
What I'm basically trying to achieve is to generate something like a sitemap, as a nested unordered list with links to all the pages contained in a website.
If all pages are linked to one another, then you can use a crawler or spider to do this.
If there are pages that are not all linked you will need to come up with another method.
You can try this:
Add an "image bug/web beacon/web bug" pointing at the logger script to each page you want tracked,
OR
alternatively add a JavaScript function to each page that makes a call to /scripts/logger.php. You can use any of the JavaScript libraries that make this super simple, like jQuery, MooTools, or YUI.
Create the logger.php script, have it save the request's originating URL somewhere like a file or a database.
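A sketch of what that logger.php could look like, saving to a flat file. The log file name and format are assumptions; each tracked page would embed something like `<img src="/scripts/logger.php" width="1" height="1">`, so the Referer header tells the script which page was viewed:

```php
<?php
// scripts/logger.php - record which page requested the beacon.
function log_page_view(?string $referer, string $logFile): void
{
    if ($referer !== null && $referer !== '') {
        // One timestamped line per page view; LOCK_EX avoids interleaved writes.
        file_put_contents($logFile, date('c') . ' ' . $referer . "\n",
                          FILE_APPEND | LOCK_EX);
    }
}

log_page_view($_SERVER['HTTP_REFERER'] ?? null, __DIR__ . '/pages.log');

// Reply with a 1x1 transparent GIF so the beacon is invisible on the page.
header('Content-Type: image/gif');
echo base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
```

Note that the Referer header is sent by the browser and can be absent or spoofed, so this only gives a best-effort log.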
Pros:
- Fairly simple
Cons:
- Requires edits to each page
- Pages that aren't visited don't get logged
Some other techniques that don't really fit your need to do it programmatically, but may be worth considering, include:
Create a spider or crawler
Use a ripper such as cURL or Teleport Plus.
Using Google Analytics (similar to the image bug technique)
Use a log analyzer like Webstats or a freeware UNIX webstats analyzer
You can easily list the files with the glob function... But if the pages use includes/requires and other stuff to mix multiple files into "one page", you'll need to import the Google "site:mysite.com" search results... Or just create a table with the URL of every page :P
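The glob idea could be sketched like this, under the assumption that each page is a single .php file in one directory (as noted above, it won't distinguish pages assembled from includes/requires):

```php
<?php
// List the page scripts in a directory by their file names.
function list_pages(string $dir): array
{
    $pages = [];
    foreach (glob(rtrim($dir, '/') . '/*.php') as $file) {
        $pages[] = basename($file);   // keep just the file name, not the path
    }
    return $pages;
}

// e.g. list_pages(__DIR__) might return ['about.php', 'index.php']
```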
Maybe this can help:
http://www.xml-sitemaps.com/ (SiteMap Generator)