Site Converter - Website Copier - php

Does anybody know of a software program that will convert a website built with PHP, JSON and jQuery into a mainly HTML format? We need to do a conversion for SEO purposes and don't want to have to rewrite the whole site.

HTML is a markup language; PHP is a server-side scripting language that generates HTML when a page is requested. You cannot convert one to the other, I'm sorry.

If you're trying to make sure that you have nothing but .HTML extensions on your public URLs for SEO purposes:
Someone's selling you a line of BS.
You need access to your server configuration.
You don't have to convert anything but your links.
The .PHP extension is the default file extension configured to be sent from Apache to the PHP engine for parsing. You can change what file extension gets parsed in your configuration file.
http://encodable.com/parse_html_files_as_php/
This will allow you to keep .HTM files static and have .HTML files parsed as if they were .PHP files.
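The handler change described above could look roughly like this. It is a minimal sketch that assumes Apache with mod_php; the handler name application/x-httpd-php is an assumption about the setup, and PHP-FPM or CGI installs use a different one.

```apache
# Assumes mod_php is loaded; other PHP setups need a different handler name.
# Parse .html files as PHP. The pattern does not match .htm, so those stay static.
<FilesMatch "\.html$">
    SetHandler application/x-httpd-php
</FilesMatch>
```

This can go in the main server configuration, or in a .htaccess file if the host allows FileInfo overrides.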

Try this: http://www.httrack.com/
It will only return a static HTML site. But it might be a good base for you.

Since the only thing which really knows what type of file you're using is the server itself, it does not really matter what you're using on the back end. Most search engines are smart enough to know that so they don't really care so much. Now, people might care. People might say, "Hm, well, this is .html, that means that this person must have a flat file which is constantly being updated," but I doubt it.
If you're really concerned about having a .html extension, then you can fake it by using a .htaccess file:
RewriteRule ^(.*)\.html$ $1.php [L]
If that is placed in a .htaccess file at the root of your site, it will internally rewrite every request which ends with .html to the corresponding .php page. The rewrite happens on the server, so it is transparent both to the user and to the crawlers.
Of course, every link on your site will need to change from .php to .html, but that replaces the impossible task of using only .html files with the merely annoying task of updating all of your .php links.
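A slightly fuller version of that rule might look like the sketch below. It assumes mod_rewrite is enabled and that the .htaccess file sits in the document root; the existence check is an addition, not part of the rule given above.

```apache
RewriteEngine On

# Rewrite /page.html to /page.php only when page.php actually exists.
# Requests for .html pages with no .php counterpart are left alone.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)\.html$ $1.php [L]
```

The condition keeps a request for a genuinely static .html page from being rewritten to a .php file that is not there.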
As to removing JavaScript, well, you could do that, or you could design your site in such a way that it still uses AJAX but it works with the search engines instead of against them. The biggest trick is to make sure that your site can work with as little AJAX as possible and then use AJAX to supplement. We've come a long way from requiring that all websites work in lynx, but it is still good practice to make sure that they are still sane without the benefit of JS/CSS.
Besides, search engines are getting smarter. Google has been working to read AJAX intelligently since 2009. But even if they weren't, there are plenty of articles out there on using AJAX without hurting SEO.
There is no need to nerf your site because of SEO -- You can have your AJAX and SEO too.

This is hard to accomplish if there is a lot of dynamic data. For a simple website you can just mirror every page and make the copies your new website, though I am not sure how useful that would be: forms and other user input fields, for example, will simply not work. In any case, this is how you do it using wget.
$ wget -m http://www.example.com/
More reading here.

Related

How can I securely allow web users to create files?

I'm building a website which allows certain users to write reviews, and I want a small php file to be automatically generated when they do. What's the most secure way to set up accounts/groups/file permissions to allow this? Ideally, I'd like the review writers to be able to change the title in case they make a mistake, which would require php to be able to not only create files and folders, but move and/or remove them, as well. However, that's not an absolute necessity. My test server is running Linux/Apache, the newest versions of everything, and for testing purposes I've temporarily set the owner of the main reviews folder as the server. I'm also open to other suggestions on how to make this happen. I'm not really an IT guy, but I can write shell scripts just fine.
Edit:
Thanks to the selected answer, I was able to come up with a solution. I used this guide (http://www.seomoz.org/ugc/using-mod-rewrite-to-convert-dynamic-urls-to-seo-friendly-urls), and modified it to just load the desired PHP script with no variables, which I designed to retrieve the information directly from the original URL using $_SERVER['REQUEST_URI']. Here's what my .htaccess file looks like. It sends www.domain.com/reviews/the-review-filepath.php to www.domain.com/reviews/review.php.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule !^review\.php$ review.php
It was much easier for me to do it this way because I know a lot more about PHP than regular expressions.
Thanks to everyone who answered and/or commented. This is much better than the way I was trying to do it before.
Extending from comment:
If you want to get a "clean" URL (e.g. /post/123/comment/456) instead of a "parameterized" URL (e.g. /?post=123&comment=456), you can still use a database, and take advantage of mod_rewrite (since you tagged apache).
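As a rough sketch of that mapping, assuming mod_rewrite is enabled, the .htaccess file is at the site root, and a hypothetical index.php reads the post and comment parameters:

```apache
RewriteEngine On

# /post/123/comment/456  ->  index.php?post=123&comment=456
# index.php is a hypothetical entry script; QSA keeps any extra query string.
RewriteRule ^post/([0-9]+)/comment/([0-9]+)/?$ index.php?post=$1&comment=$2 [L,QSA]
```

The script then looks the records up in the database from $_GET['post'] and $_GET['comment'], so no per-review files need to be generated.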

Does anyone use index.html?

As a precaution, I now always name my index page index.php, whatever the project. Does anyone use index.html regularly? Can you be completely sure you'll never need PHP for the page? Are there any performance issues encountered by always using index.php over index.html?
For servers that don't support PHP, avoid the .php extension unless you are trying to mask the server technology by faking a .php extension.
For static sites it doesn't really matter which extension you use as long as you know your server is configured correctly (see Dominic Rodger's answer). For that matter, not many of your visitors will care whether it's a static or dynamic site. Also, some dynamic sites accept URLs that end in .html as opposed to .php.
Are there any performance issues encountered by always using index.php over index.html?
The PHP interpreter will immediately hand your output back to your web server if there is absolutely no PHP code in it (all it does is send some engine-specific headers), so the performance difference is negligible if at all existent.
You should use index.html, and then if you decide you need PHP, create an index.php, and change your DirectoryIndex directive (if you're using Apache).
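For reference, the directive in question looks something like this (a minimal sketch; the order shown is only one possible choice):

```apache
# Files are tried left to right; the first one that exists is served.
DirectoryIndex index.html index.php
```

With this order, index.html wins when both files exist. Swap the names to prefer index.php.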
On most web servers, .html files are not parsed by the PHP interpreter, so I think, yes, there is a speed advantage.
I use .html files for very small sites without anything special. Sure, why not? They will never get updated, so there is no need for PHP.
Of course. If you have a static site, you know that you won't have PHP code.
I guess that for .php files the server has to parse the file even if it doesn't contain any PHP tags or code, but I think that's really negligible.
For sure, the answer to the main question is that if you don't need any PHP for your site, you can use .html/.htm for the index page, as everybody stated.
But sometimes I use it as an awesome trick: when I want to update some pages, fix an issue on the site, or post a notice for visitors, I upload an index.html page saying what I want. Note that for this to work, the site itself must always use index.php, and the server must serve index.html ahead of index.php.
Of course, your server needs to accept PHP files :P
I hope I helped!

What is a good way to set up a site template with PHP on IIS6?

I am not very experienced with PHP. I have a site I'm maintaining that is on IIS6 using PHP. Right now it is using include files and querystrings to serve up content.
For example:
http://mysite/index.php?maincontent=services&subcontent=service1&nav=subnav1
We want to change the site so that URLs look more like (for example):
http://mysite/commercial-services.php
But, I don't want to duplicate code and include files in the 30+ files of the web site.
Is there an easy way with php to have a template that keeps the short urls but allows you to use the same layout file for multiple pages?
I do mostly .net web sites so I guess what I'm looking for is something comparable to asp.net master pages.
I also looked at php frameworks, namely codeigniter. However, that by default leaves an index.php in the middle of the url. From what I read we would need to use some type of isapi rewrite to get rid of that. I can't do that because I don't have access to the server and they don't want to install things on the server.
Is there anything simple we can use or are we limited to using includes?
Update:
For this I ended up converting the site to .net. It was much faster and easier (for me) to do than figure out how to set up something with PHP.
I'd say look at rewrites, not frameworks, if all you want to do is change the URLs. That way your back-end PHP can stay the same but you still get the nice URLs.
There are loads of tutorials; a quick Google search gave me:
http://articles.sitepoint.com/article/guide-url-rewriting

How to hide the url in php

Is it possible to hide the URL in the address bar of the web browser so that it won't necessarily match the location of the files?
For example, this url:
http://localhost/exp/regstuds.php
Just by looking at it, you will always know where to find the files on the computer.
Is it possible to distort, rearrange or hide the URL in such a way that the location of the files will not be revealed?
Yes, if you're using Apache look into using mod_rewrite. There are similar rewrite modules for pretty much all other web servers too.
I hope your sole motivation for doing this is not "security through obscurity". Because if it is, you should probably stop and spend more time on something more effective.
If you are hosting your PHP on an Apache server, you probably have the ability to use the mod_rewrite module. You can do this by adding rules to your .htaccess file...
RewriteEngine on
RewriteRule ^RegStuds/ regstuds.php
This would cause http://localhost/RegStuds/ to actually render regstuds.php, but without ever displaying it in the address bar.
If you are on IIS, you can perform the same function using an ISAPI Rewrite Filter.
If you don't have mod_rewrite or an ISAPI Rewrite Filter, you can get a similar result using a folder structure, so you would have a physical path of RegStuds/index.php - and you would never need to link to "index.php" as it is the default file. This is the least recommended way of doing it.
No, it's not.
Each bit of functionality must have a unique identifier (URI) so that the request is routed to the right bit of code. The mapping can be non-linear using all sorts of tricks - mod_rewrite, front controller, content negotiation...but this is just obscuring what's really going on.
You can fudge what appears in the address bar on the browser by using a front-controller architecture and using forms / POSTs for every request but this is going to get very messy, very quickly.
Perhaps if you were to explain why you wanted to do this we might be able to come up with a better solution.
C.

Running other file types as PHP

Is there any problem with running HTML as PHP via .htaccess, such as security or best-practice concerns? I was doing this to make URLs cleaner.
## run the following file types as php
AddHandler application/x-httpd-php .html .htm .rss .xml
Well, ideally I'd like to have my URLs like
localhost/blog/posts/view.php?id=64
to be
localhost/projects/bittyPHP/bittyphp/posts/view/id-64
But I'm having trouble accomplishing that without routing everything to one file and having PHP determine the paths. I guess this is my real question.
I would use mod rewrite.
You probably do not need to run all HTML files as PHP, and if you have short_open_tag enabled, the "<?" that opens an XML declaration will give you trouble.
Keep in mind that you will then run each and every one of those files through the PHP handler. If there is no PHP inside the files, the parser will still inspect them to see if there is any PHP in them. This adds some overhead, but it is likely negligible in most setups.
The main issue, I would say, is performance. If you have a significant number of plain HTML files, then you're creating unnecessary overhead by always running them through the PHP interpreter.
Best practice is not to do this, but to use "friendly" URLs like mysite.com/item/123 and use mod_rewrite to convert them to mysite.com/displayitem.php?id=123 internally.
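A minimal sketch of that mapping, assuming mod_rewrite is enabled and the .htaccess file sits in the same directory as displayitem.php:

```apache
RewriteEngine On

# /item/123  ->  displayitem.php?id=123
# This is an internal rewrite, so the browser keeps showing /item/123.
RewriteRule ^item/([0-9]+)/?$ displayitem.php?id=$1 [L]
```

Only numeric IDs match here; anything else falls through to normal file handling.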
Like many people have already stated, mod_rewrite is the best solution for accomplishing friendly URLs.
Sitepoint has a decent guide to getting started with mod_rewrite.
