How to find directories of a website using php? - php

I am working on an SEO project in PHP where I need to crawl every page/directory of a website. To do this, I need to know or list all the directories of that website. Is it possible?
Can we do it with PHP?

No. HTTP provides no way for a client (regardless of the programming language it is written in) to ask for directory listings.
This is why search engines crawl links and make use of sitemaps.
A PHP program could inspect the directory structure of the file system of the computer it is running on, but even that wouldn't give you a good view in the general case as most websites are not simply a bunch of files served up by mapping URLs directly onto a filesystem (for example, the Front Controller design pattern is quite common).
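Since links are all a crawler has to work with, one crawl step can be sketched roughly as below. This is a minimal sketch, not a full crawler: the function name extractLinkedPaths() is hypothetical, it assumes PHP's DOM extension is available, it only keeps root-relative and same-host links, and fetching the page and following the results is left out.

```php
<?php
// Hypothetical helper: collect the distinct same-site paths linked from one HTML page.
function extractLinkedPaths(string $html, string $host): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $paths = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $parts = parse_url($anchor->getAttribute('href'));
        if ($parts === false || !isset($parts['path'])) {
            continue; // empty, fragment-only, or unparsable href
        }
        if (isset($parts['scheme']) && !in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
            continue; // mailto:, javascript:, and similar
        }
        if (isset($parts['host']) && strcasecmp($parts['host'], $host) !== 0) {
            continue; // link to another site
        }
        if ($parts['path'][0] !== '/') {
            continue; // document-relative links are not resolved in this sketch
        }
        $paths[$parts['path']] = true; // array keys de-duplicate the paths
    }

    return array_keys($paths);
}
```

A crawler would call this on each fetched page and queue the returned paths it has not visited yet. It will only ever find what is linked (or listed in a sitemap), never an unlinked directory.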

Related

Multilanguage htaccess

I have a simple PHP site which I am looking to make multi-language. My goal is to have separate XML files holding the translations and to use a function to fetch the phrases corresponding to the language of the site.
Each site has a different domain, but will use the same database and the same PHP files (which are deployed to each domain on one or more web servers).
However, I want the URLs to be language-specific. So if I have a URL /lineups in English, I want to have a Danish one saying /startopstillinger
What is the best way to do this? Do I have to manually write an .htaccess file for each site, or is it possible to make rules based on the domain name in a single .htaccess file?
Hope someone can shed a little light for me.
Rgds
Rasmus

Secure folder structure for php site

I'm a Java (SE, EE) developer and I have been working with these technologies for almost 6 years. I have also worked with PHP for non-web apps.
Now I'm required to build a site in PHP, but I have googled a lot and I can't find a standard folder structure for a PHP site. As you may know, in Java EE there is a defined structure, and with web.xml you can define security rules to allow or deny access to folders in the web root.
So the question is: Is there a standard folder structure to bring security in a php site?
If there is not, how can I prevent access to folders in my site, without the need to use .htaccess nor moving my folders to a private area in the web server?
There is no defined structure for PHP projects. Application frameworks invariably use well-defined structures, but that is decided individually by each framework. In addition, the developer can easily work outside these structures (the price being that "automatic" features of the framework might no longer work in some cases).
In order to prevent access to directories in your site you have to do one of the things you mentioned: either use web-server-level mechanisms such as .htaccess or move the directories outside the web root entirely.
That said, in many cases there is no explicit need for such security: by strictly limiting the pieces of code that can produce immediate effects (typically down to just one front controller that boots up the application) and making sure that data is contained inside PHP code (so that the web server will not reveal the contents of files) you effectively render direct access from the outside worthless.
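The single-entry-point idea can be sketched as below. This is only an illustration under assumptions: dispatch() and the route table are hypothetical names, and a real application would load its handlers from files that do nothing but define functions or classes, so requesting those files directly produces no output and has no effect.

```php
<?php
// Hypothetical front controller: the only script that produces immediate effects.
// It answers the paths listed in $routes and nothing else.
function dispatch(string $requestUri, array $routes): array
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || !isset($routes[$path])) {
        return [404, 'Not found'];
    }
    return [200, $routes[$path]()];
}

// Hypothetical route table mapping URL paths to handlers.
$routes = [
    '/'        => function () { return 'Home'; },
    '/contact' => function () { return 'Contact'; },
];

// In index.php this would be called with $_SERVER['REQUEST_URI'].
[$status, $body] = dispatch('/contact?ref=menu', $routes);
```

Anything not in the route table, including the path of a library file, ends up as a 404 from the application's point of view.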

Is it important to use baseUrl() in Zend Framework view scripts?

Some people have reported issues with accessing, setting, or getting the right value from baseUrl() in a view script. But I'm wondering why it is necessary to use it at all, at least in a situation like mine where the ZF application is on a virtual private host (Amazon EC2) where I have full control of the directory structure and apache rewrite rules, as well as routes.
I know, for example, that in the filesystem foo.jpg lives in public/images/foo.jpg, and that the application's mod_rewrite will direct all requests to public - so in my view scripts it's a lot simpler/clearer and more efficient to write something like
<img src="/images/foo.jpg" />
instead of
<img src="<?php echo $this->baseUrl();?>/images/foo.jpg" />
What sort of future-proofing robustness or other benefit does the use of baseUrl() really provide? So far I haven't used it at all, and had no problem. But I've inherited some code that uses it, and my inclination is to strip out those uses whenever I'm editing a view script that contains them. Would I regret that later?
Used this way, it's not really useful, but on the other hand, using it this way
echo $this->baseUrl('/images/foo.jpg')
might prove useful in the future, since you can add logic before printing the URL. Imagine that in a few years your website grows far more than you expected and you have to move all your static content to a content delivery network (CDN): you would have to correct every image/CSS/JS URL manually (or with search and replace). With baseUrl() (or, as you might name it, assetUrl()) you would just have to add your CDN's URL in one place and it would be fixed everywhere in your application.
EDIT
I found a use for baseUrl() in the code you inherited:
It allows you to add a common URL part to all of your links and references, in case your site is not at the root of the domain,
e.g. www.mysite.com/zf-app/
In your config file you would just have to add
resources.frontController.baseUrl = "/zf-app/"
for it to work, and all of your links would be prepended with that part
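To make both cases concrete, here is a rough sketch of such a helper in plain PHP. It is not Zend Framework's implementation: the function assetUrl(), its parameters, and the host cdn.example.com are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: build the URL of a static asset in one place.
// $baseUrl is the sub-directory the app is served from ('' when at the domain root);
// $cdnHost, when set, takes over and points the asset at a CDN instead.
function assetUrl(string $path, string $baseUrl = '', string $cdnHost = ''): string
{
    $path = '/' . ltrim($path, '/'); // normalise to exactly one leading slash
    if ($cdnHost !== '') {
        return 'https://' . $cdnHost . $path;
    }
    return rtrim($baseUrl, '/') . $path;
}

echo assetUrl('/images/foo.jpg'), "\n";                         // app at the domain root
echo assetUrl('/images/foo.jpg', '/zf-app/'), "\n";             // app in a sub-directory
echo assetUrl('/images/foo.jpg', '', 'cdn.example.com'), "\n";  // assets moved to a CDN
```

The view scripts stay the same in all three situations; only the configuration passed to the helper changes.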
Perfect example. I have a couple of basic Zend-y utilities I built on separate systems. On my test platform, I just give each of them its own virtual host with its own document root. Generally I access these tools from a remote web browser, but that requires a VPN connection to the system running them, as I didn't create these tools to be on anything more than a subnet and don't really want to expose them on an internet-facing site.
So along come Android phones and things like bitweb server, which let you run lighttpd, PHP and MySQL in minimal form on a cell phone. The only problem is that it isn't really powerful enough to do virtual hosting on Android.
No problem, it will allow for basic aliasing, so I just move all the tools each into their own sub-directory on my sdcard and use lighttpd mod_alias definitions to point to each and then create re-write rules for each subdirectory. But that led me to this post and others like it to fix all the static urls that pointed to href="/some/path/to/static/content"
I even had to update some urls to zend tools that were absolute paths to utilize {view}->url() calls instead.
By adding the baseUrl calls to the front of the static content, and using the view url() method for calling controller actions, I can now move the entire Zend MVC for any one of the independent tools into any directory I want and have them run from as deep in the web-tree as I desire. Zend does the rest and all it takes is 2-3 properly formatted entries in the lighttpd conf file.

PHP Application Structure/Pattern - 2 sites with shared libraries and assets

I'm having a bit of an application structure design dilemma.
I have created a web app that creates online surveys. It all works fine, but I would now like to create a new site that does different types of online surveys. This new site will be pretty much 95% similar in terms of layout, logic, functions, etc.
Rather than duplicate all the code from the current web app, I'd like the new app to share in the "fountain of knowledge" created by the current app - so to speak.
Can anyone enlighten me with their experiences of doing this sort of thing? Their best practices?
As a rough guide, I'm currently thinking of using symlinks for all the major logic files (library.php, functions.php, etc), and then deciding which logic to use based on which URL the user logged-in from.
Does that sound like a good or bad idea?
Would it be any better or worse to divide the whole system into 3 sites, with the site in the middle containing all the common elements and logic? This middle site would have no independent use - it would only be used by the 2 applications looking for functionality, assets, etc.
Any help and experience on this matter is very much appreciated indeed.
I'm very wary of going down a dead-end solution.
Kind Regards,
Seb
Good solution if:
you host your website yourself and creating symlinks between different virtual hosts is not a problem
you won't have to make significant changes between the 2 websites
But instead of using symlinks, you could take advantage of PHP's include_path directive and put the common libraries in this path. That way, you just write your includes relative to this path, and the files will be accessible from any site you want on the same server.
Note that include_path does not bypass open_basedir: if that directive is set for a virtual host, the shared library directory must also be listed in it, otherwise includes from outside the virtual host's base dir will still be refused.
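A minimal sketch of the include_path approach, under assumptions: the shared directory here is created in the system temp directory purely so the example is self-contained, and library.php with its surveyTitle() function is a hypothetical stand-in for the real common code.

```php
<?php
// Assumption: the shared code lives in one directory outside both sites.
// A temp directory is used here only so the sketch can run on its own.
$sharedDir = sys_get_temp_dir() . '/shared-lib-demo';
if (!is_dir($sharedDir)) {
    mkdir($sharedDir, 0777, true);
}

// Hypothetical shared library file used by both survey sites.
file_put_contents(
    $sharedDir . '/library.php',
    "<?php\nfunction surveyTitle(string \$name): string { return 'Survey: ' . \$name; }\n"
);

// Each site's bootstrap puts the shared directory at the front of include_path...
set_include_path($sharedDir . PATH_SEPARATOR . get_include_path());

// ...and can then include shared files by their relative name.
require_once 'library.php';

echo surveyTitle('Customer feedback'), "\n";
```

On a real server the same effect can be had by setting include_path in php.ini or per virtual host instead of calling set_include_path() in code.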
This is how I'd do it...
Create a core library.
Create your 2 site directories.
Create site-specific code folders in each site.
Create core library folders in each site that symlink to the main core library created.

Using the same PHP software for multiple domains inside the same Apache webserver?

Basically, let's say I have a webserver and I resell hosting specifically for local churches.
I have 5 churches as clients and a simple CMS made for them. They all run identical copies of the same files: for each website I install the CMS, the database and the website. I think it's a waste of resources.
I would like to know if I can do the following. AFAIK most webhosts have the following structure:
A main directory (home)
www.church1.com (church1)
www.church2.com (church2)
www.church3.com (church3)
www.church4.com (church4)
www.church5.com (church5)
Basically I want the CMS to be in the home directory, and each of the churches (clients) would only have a config file, a database and the templates for its website.
So the system source code would be shared, but the website design and the database files would be completely separate.
I'm not a webhosting or development expert, but I know my way around. I'm sorry if the question is too basic; I'm having a hard time finding out if this is possible.
EDIT: I Think Rudu's reference pretty much solved my problem!
Since you are building it yourself, put the include files (application logic) in a folder or include path that is accessible to all the domains. Then you can put your template files, images and stylesheets in the individual domain folders. If you are database driven you can check the domain $_SERVER['HTTP_HOST'] and load results from a certain table or database based off of that. You really can go a lot of different directions here if you are building it yourself.
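As a rough sketch of the $_SERVER['HTTP_HOST'] idea: the function siteConfigForHost(), the database names and the template paths below are hypothetical. Because the Host header is supplied by the client, the sketch only uses it as a lookup key into a fixed list and never builds a path or database name from it directly.

```php
<?php
// Hypothetical per-client settings; the shared CMS code is the same for all of them.
$sites = [
    'church1.com' => ['database' => 'church1_db', 'templates' => '/home/church1/templates'],
    'church2.com' => ['database' => 'church2_db', 'templates' => '/home/church2/templates'],
];

// Hypothetical helper: map the requested host name to one client's settings.
function siteConfigForHost(string $host, array $sites): ?array
{
    $host = strtolower($host);
    $host = preg_replace('/:\d+$/', '', $host);   // drop an optional port
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);                 // treat www.church1.com like church1.com
    }
    return $sites[$host] ?? null;                 // unknown hosts get no configuration
}

// The fallback value is only there so the sketch also runs from the command line.
$config = siteConfigForHost($_SERVER['HTTP_HOST'] ?? 'www.church1.com', $sites);
```

The shared CMS would then connect to $config['database'] and render with the templates in $config['templates'], and refuse the request when $config is null.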
The answer is yes, it is possible. There are some settings (I don't remember exactly which ones right now) that can block it, but as long as every site is set up to know where the shared libraries are, it will work.
