How to generate an XML sitemap for a PHP website

Until now we have used WordPress for our website, and an XML sitemap for the whole site was very easy to create with plugins.
Now we are switching to a PHP website created from scratch, and after some Google searching I haven't found anything that helps me understand how to create a sitemap for my website.
Can someone help, with any kind of software or web script?
Thank you.

It really depends on how your pages are served (from a database, include files, what have you), but if you don't have very many pages you could simply create a document called sitemap.xml by hand and place it in your root directory. A quick Google search will turn up droves of examples you can emulate.

To create a sitemap.xml, read up on sitemap.xml format, then create an XML file that conforms to that format.
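For reference, a minimal conforming file looks like this; per the sitemaps.org protocol only <loc> is required, while lastmod, changefreq, and priority are optional (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/</loc>
    <lastmod>2012-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <!-- repeat one <url> block per page; only <loc> is required -->
</urlset>
```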
How to do this? Well, that depends on the structure of your site. Maybe you could write it by hand, maybe you could generate it based on what is in the database, maybe you could crawl your site, maybe you could...? This question doesn't really have a specific answer; it all depends on how your site is organized.

If you have a list of pages in a database, you can use that. Almost all websites have either a directory of static pages, a database with pages, or a combination of both, so you should be able to generate a list of all your pages. If you can put them in an array, you can put them in XML as well. Use PHP's SimpleXML extension. It is, well, Simple. :)
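A minimal sketch of that array-to-XML step, assuming your page list comes from wherever you store it (the $pages array here is a hypothetical stand-in):

```php
<?php
// Turn a list of page URLs into a sitemap.xml with SimpleXML.
// $pages is a placeholder -- in practice you would fill it from your
// database or by scanning your directory of static pages.
$pages = [
    'http://example.com/',
    'http://example.com/about',
    'http://example.com/contact',
];

$urlset = new SimpleXMLElement(
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>'
);

foreach ($pages as $page) {
    $url = $urlset->addChild('url');
    // htmlspecialchars guards against unescaped ampersands in query strings
    $url->addChild('loc', htmlspecialchars($page));
    $url->addChild('lastmod', date('Y-m-d')); // optional per the protocol
}

// Drop the finished file in the web root so crawlers find /sitemap.xml
$urlset->asXML($_SERVER['DOCUMENT_ROOT'] . '/sitemap.xml');
```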
If you cannot generate an export this way, you could use some kind of crawler to generate a list of URLs found while crawling your main page and all successive pages on your domain.

Related

Can I crawl websites, download specific pages, and save rendered versions as PDFs in PHP?

I just need some clarity here on whether this concept is possible, or whether I have misunderstood what crawlers are capable of.
Say I have a list of 100 websites/blogs, and every day my program (I am assuming it's a crawler of some kind) will crawl through them; if there is a match for a specified phrase like "miami heat" or "lebron james", it will download that page, convert it to a PDF with the full text and images, and save that PDF.
So my questions are:
This kind of thing is possible, right? Please note that I don't want just text snippets; I am hoping to capture the entire page as if it were printed out on a piece of paper.
Programs of this type are called crawlers, right?
I am planning to build on code from http://phpcrawl.cuab.de/about.html
This is perfectly possible. Since you are going to use phpcrawl to crawl the web pages, use wkhtmltopdf to convert the HTML to PDF as it is rendered.
Yes, it is possible: with the wkhtmltopdf tool you can convert a web page as it is. It's desktop-based software, so you can install it on your machine.
Yes, crawlers.
phpcrawl is a perfect tool for building what you want to build.
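A minimal sketch of the match-then-render step these answers describe, assuming wkhtmltopdf is installed and on the PATH and that an archive/ directory exists; the URL and phrases are placeholders, and the actual crawling would come from phpcrawl:

```php
<?php
// Fetch a page, look for the watched phrases, and if one matches shell
// out to wkhtmltopdf to save a fully rendered PDF of the page.
$url     = 'http://example.com/sports-blog/'; // hypothetical monitored page
$phrases = ['miami heat', 'lebron james'];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
curl_close($ch);

foreach ($phrases as $phrase) {
    if ($html !== false && stripos($html, $phrase) !== false) {
        $pdf = 'archive/' . date('Y-m-d') . '-' . md5($url) . '.pdf';
        // wkhtmltopdf re-fetches the URL itself and renders it, images and all
        exec('wkhtmltopdf ' . escapeshellarg($url) . ' ' . escapeshellarg($pdf));
        break; // one PDF per page per day is enough
    }
}
```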
Yes, it is possible.
You could call it a crawler or a scraper, since you are scraping data off the websites.
Rendering the website to a PDF will probably be the hardest part; there are web services that can do this for you.
For example:
http://pdfmyurl.com/
(I have no affiliation and I have never used them; it was just the first site in the Google results when I checked.)

How do I properly use code in Joomla articles?

I am very new to web development and CMSs. I want to make a Joomla site that features articles with a lot of graphs at the top of the page and written content below them. The charts will probably be done with FusionCharts, with some controls directly below them to dynamically change the data displayed in the charts, preferably without reloading the page.
My question is: what is the most appropriate way to do this in Joomla? Can I get the Sourcer add-in and simply create articles using inline JavaScript calls to place the charts and controls directly in the article? Is this how people usually embed non-text content in Joomla? Is it possible to access the database with code embedded directly in the article to generate the chart?
I don't really want to learn too much of the Joomla API right now; I'm more interested in using the CMS features to create the pages and then just coding everything else in JavaScript/PHP directly in the page, but I'm not sure whether that is appropriate or whether it would introduce security concerns to my site.
Why not try the FusionCharts extension for Joomla?
This will be much easier than coding it yourself; the work has already been done.
I believe the best thing to do is just use a good WYSIWYG editor and then use its source-code feature.
TinyMCE does the job just fine.
Are you looking for plugins or components to do this, or do you just want to log into the administrator backend and start right away?

Advice/tips on the best way to spider/crawl/collect audio content from the internet

Well, what I'm actually trying to do is figure out how BEEMP3.COM works.
Because of the site's speed, I doubt they scrape other sites/sources on the spot.
They probably use some sort of database (PostgreSQL or MySQL) to store the "results" and then just query the search terms.
My question is: how do you guys think they crawl/spider or actually get the MP3 files/content?
They must have some algorithm to spider the internet, or they use Google's "index of" MP3 trick to find hosts with the raw MP3 files.
Any comments and tips or ideas are appreciated :)
QueryPath is a great tool for building a web spider.
I'm guessing they find MP3s using a combination approach: they have a list of "seed sites" (gathered from Google, Usenet, or manually inserted) that they use as starting points for the search, and then they set spiders running against them.
You need to write a script that will:
Take a webpage as a starting point
Fetch the webpage data (use cURL)
Use a regular expression to extract (a) any links and (b) any links to MP3 files
Place any MP3 links into a database
Add the list of links to other webpages to a queue for processing through the above method
You'll also need to re-check your MP3 links regularly to remove any dead links.
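A minimal sketch of that loop, assuming a single seed page, same-site links only, and no robots.txt handling (a real spider needs politeness rules; the seed URL is a placeholder):

```php
<?php
// Breadth-first crawl: pull a URL off the queue, fetch it with cURL,
// regex out the links, keep .mp3 links, and queue same-site pages.
$queue = ['http://example.com/']; // hypothetical seed page
$seen  = [];
$mp3s  = [];

while ($url = array_shift($queue)) {
    if (isset($seen[$url])) {
        continue;
    }
    $seen[$url] = true;

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $html = curl_exec($ch);
    curl_close($ch);
    if ($html === false) {
        continue;
    }

    // Crude link extraction; relative-URL resolution is omitted here
    preg_match_all('/href=["\']([^"\']+)["\']/i', $html, $matches);
    foreach ($matches[1] as $link) {
        if (preg_match('/\.mp3$/i', $link)) {
            $mp3s[] = $link; // the "place into a database" step goes here
        } elseif (strpos($link, 'http://example.com/') === 0) {
            $queue[] = $link; // same-site page: back on the queue
        }
    }
}

print_r($mp3s);
```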
Alternatively, you can crawl MP3 spiders like beemp3.com, extract all the direct download links, and save them to your database. You only need two things:
I. Simple HTML DOM.
II. An application that takes the extracted links into your database.
Check what I did at http://kenyaforums.com/bongomp3_external_link_search_engine_at_kenyaforums_com.php
Keep asking if anything is unclear.
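For the Simple HTML DOM route, a minimal sketch, assuming the library's simple_html_dom.php file is on your include path; the listing page URL is a placeholder:

```php
<?php
// Pull every anchor off a page with Simple HTML DOM and keep the ones
// that point straight at .mp3 files.
include 'simple_html_dom.php';

$page = file_get_html('http://example.com/music/'); // hypothetical page
foreach ($page->find('a') as $anchor) {
    if (preg_match('/\.mp3$/i', $anchor->href)) {
        // In a real setup, INSERT the link into your database here
        echo $anchor->href, PHP_EOL;
    }
}
```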

How to display the RSS version of a news link

I'm trying to write a script that will show the RSS-style version of a single URL (title, author, image, source, etc.). This should behave much the way Facebook does when you copy and paste a link to share and it generates this information automatically. I'm trying to do this with a PHP script, but I would also be open to open-source programs that can do this.
Also, if anyone knows of any Joomla/Drupal plugins that can do this, that would be great. This may eventually end up on a site run on one of these frameworks.
Thanks!
I'm not really sure what you're asking for, but here is a list of Joomla extensions that handle RSS syndication:
link text
This seems like a pretty specific application to get started with a big framework. Anyway, some Drupal modules worth checking out:
http://drupal.org/project/facebook_link
http://drupal.org/project/views
http://drupal.org/project/views_rss
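For the PHP-script route the asker mentions, a minimal sketch that pulls the page title and Open Graph meta tags (the tags Facebook's link preview reads) out of a page; the URL is a placeholder:

```php
<?php
// Fetch a page and extract the link-preview fields from its markup.
$url  = 'http://example.com/some-article'; // hypothetical target
$html = file_get_contents($url);

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate sloppy real-world HTML
$doc->loadHTML($html);
libxml_clear_errors();

$info = ['title' => '', 'image' => '', 'description' => ''];

// Fall back to the plain <title> tag...
$titles = $doc->getElementsByTagName('title');
if ($titles->length > 0) {
    $info['title'] = trim($titles->item(0)->textContent);
}

// ...then prefer Open Graph tags when they exist
foreach ($doc->getElementsByTagName('meta') as $meta) {
    switch ($meta->getAttribute('property')) {
        case 'og:title':
            $info['title'] = $meta->getAttribute('content');
            break;
        case 'og:image':
            $info['image'] = $meta->getAttribute('content');
            break;
        case 'og:description':
            $info['description'] = $meta->getAttribute('content');
            break;
    }
}

print_r($info);
```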

Get a list of all the XML pages on a website

Using PHP, how do I go to a website and get a list of all the XML files on the site? (It would also be nice to get the date each file was last changed.)
There are HTML files linking to HTML pages that have an XML counterpart; does that help?
You might want to use SimpleXML or the file_get_contents function, depending on your requirements.
It depends on what you mean by "list of all XML files". There is no way to find unlinked files without a brute-force search, which is impossible in practice at this scale. If you only care about linked files, you'll either have to crawl the site yourself and match links in the content of the pages, or do a Google search combining the site: and inurl:.xml operators and then parse the results.
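A minimal sketch of the linked-files approach for a single page, assuming the XML files are linked with plain <a href> tags and the server sends a Last-Modified header (many don't for dynamically generated files); the site URL is a placeholder:

```php
<?php
// List .xml links found on one page and ask the server when each file
// last changed. Relative-URL handling here is deliberately naive.
$base = 'http://example.com/';
$doc  = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML(file_get_contents($base));
libxml_clear_errors();

foreach ($doc->getElementsByTagName('a') as $a) {
    $href = $a->getAttribute('href');
    if (preg_match('/\.xml$/i', $href)) {
        $headers = get_headers($base . ltrim($href, '/'), 1);
        $changed = isset($headers['Last-Modified'])
            ? $headers['Last-Modified']
            : 'unknown';
        echo "$href (last changed: $changed)\n";
    }
}
```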
