Safety on 404 error doc - php

I am working on my 404 error doc, and I was thinking that instead of just giving a sitemap, one could suggest to the user the website he might have been looking for, based on what actually exists on the server.
Example: if the person typed in "www.example.com/foldr/site.html", the 404 page could output:
Did you mean "www.example.com/folder/site.html"?
For this, I wrote the following code, which works very well for me. My question now is: is it "safe" to use this? Basically, someone could detect all the files on the server by trying all kinds of combinations. Or a hacker could even use a script that loops through and lists all kinds of valid URLs.
Should I limit the directories this script can detect and propose? With an array of "OK" locations, or by file type?
Has anyone else already had an idea like this?
PHP:
// get the incorrect URL that was entered, split into path segments
$script = explode("/", $_SERVER['SCRIPT_NAME']);
$query = $_SERVER['QUERY_STRING']; // not used below; kept from the original
// create vars
$match = array();
$matched = "../";
// loop through the given URL folder by folder to find the suggested location
foreach ($script as $dir) {
    if (!$dir) {
        continue;
    }
    if ($handle = opendir($matched)) {
        while (false !== ($entry = readdir($handle))) {
            if ($entry != "." && $entry != "..") {
                // percentage similarity between the requested segment and a real entry
                similar_text($dir, $entry, $perc);
                if ($perc > 80) {
                    $match[$entry] = $perc;
                }
            }
        }
        closedir($handle);
        if ($match) {
            arsort($match);                 // best match first
            reset($match);
            $matched .= key($match) . "/";
        } else {
            $matched = false;               // no entry was close enough: give up
            break;
        }
        $match = array();
    }
}
// trim and echo the result that had the highest match
if ($matched === false) {
    echo "no similar URL found";
} else {
    $matched = trim(ltrim(rtrim($matched, "/"), "."));
    echo "proposed URL: " . $_SERVER["SERVER_NAME"] . $matched;
}

Yup, you can see it like this:
Imagine a house with only glass walls on the outside, but it's night. You're a thief (hacker) and you want to check the house for valuable loot (files with passwords, DB connections etc.).
If you don't protect (certain) files, you would be putting the lights on in every part of the house. The thief would look through the windows and see that you have loot; now the only thing he would have to do is get in and take it.
If you do protect the files, the thief won't even be able to know that there was any loot in the house, and thus the thief would have a higher chance of moving on to the next house.
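One way to act on the "array of OK locations" idea from the question, added here as an example (it is not code from the question or the answer above): the 404 handler compares the requested path only against a fixed list of public URLs and never reads the filesystem, so a scripted probe cannot learn about anything outside that list. The list contents, the 80% threshold and the sample request are assumptions.

```php
<?php
// Added example (not the poster's code): answer 404s from a fixed list of
// public URLs instead of walking the real filesystem. Anything not in this
// list (config files, backups, uploads) can never be proposed, so a scripted
// 404 probe learns nothing about the server's directory layout.
// The list and the 80% threshold are assumed values for illustration.
const PUBLIC_PATHS = [
    '/folder/site.html',
    '/folder/about.html',
    '/docs/index.html',
];

// Return the closest allowlisted path, or null if nothing is similar enough.
function suggest_path(string $requested, array $allowlist, float $minPercent = 80.0): ?string
{
    // Keep only the path part and drop empty, "." and ".." segments, so the
    // comparison never sees query strings or traversal sequences.
    $path = parse_url($requested, PHP_URL_PATH) ?: '';
    $segments = array_filter(explode('/', $path), function ($s) {
        return $s !== '' && $s !== '.' && $s !== '..';
    });
    $path = '/' . implode('/', $segments);

    $best = null;
    $bestPercent = $minPercent;
    foreach ($allowlist as $candidate) {
        // Same similarity measure as the question's code, applied to the whole path.
        similar_text($path, $candidate, $percent);
        if ($percent > $bestPercent) {
            $bestPercent = $percent;
            $best = $candidate;
        }
    }
    return $best;
}

// Constructed sample request when run from the CLI; under a web server the
// 404 handler reads the originally requested URL from $_SERVER['REQUEST_URI'].
$requested = $_SERVER['REQUEST_URI'] ?? '/foldr/site.html?x=1';
$suggestion = suggest_path($requested, PUBLIC_PATHS);

http_response_code(404);
if ($suggestion !== null) {
    // Escape the suggestion before echoing it into the page.
    echo 'Did you mean ' . htmlspecialchars($suggestion, ENT_QUOTES, 'UTF-8') . "?\n";
} else {
    echo "Page not found.\n";
}
```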

Related

Accessing Windows Objects with PHP?

I made a tool to organize file content in a specific way. These files are located all over my PC, which runs on Windows 7. The tool is made up of two parts: 1. the interface holding a form; 2. the script to do the work.
Instead of having to manually write the full path to a certain directory into the main script, I'd rather have the tool search for it and retrieve it seamlessly. I'm thinking of maybe adding a text field and a button, in which I can enter the name of the directory I'm looking for and, after clicking the button, retrieve the directory's full pathname, print it to the same text field and then pass it along to the program itself.
I've searched for several days for ways to have PHP interact with Windows (maybe with Windows' search object), but all I've found is very long documentation on COM and then on .NET. These however seem to strictly deal with accessing Office objects, since most of the available examples are about Excel or Word objects.
How can I accomplish the functionality I want to add to my interface?
To avoid further confusion, this is the image of the Windows object I'm referring to > Windows Startup Search Field
Use this handy function - just point it to the root of your filesystem and it will return an array with all the matching files - I mean, matched by the regular expression pattern you provide to the function.
// PREG_FIND_RECURSIVE - go into subdirectories looking for more files
// PREG_FIND_DIRMATCH - return directories that match the pattern also
// PREG_FIND_DIRONLY - return only directories that match the pattern (no files)
// PREG_FIND_FULLPATH - search for the pattern in the full path (dir+file)
// PREG_FIND_NEGATE - return files that don't match the pattern
// PREG_FIND_RETURNASSOC - Instead of just returning a plain array of matches,
// return an associative array with file stats
// to use more than one simply separate them with a | character
define('PREG_FIND_RECURSIVE', 1);
define('PREG_FIND_DIRMATCH', 2);
define('PREG_FIND_FULLPATH', 4);
define('PREG_FIND_NEGATE', 8);
define('PREG_FIND_DIRONLY', 16);
define('PREG_FIND_RETURNASSOC', 32);
function preg_find($pattern, $start_dir = '.', $args = 0)
{
    $files_matched = array();
    $fh = @opendir($start_dir); // unreadable directory: silently return no matches
    if ($fh)
    {
        while (($file = readdir($fh)) !== false)
        {
            if (strcmp($file, '.') == 0 || strcmp($file, '..') == 0) continue;
            $filepath = $start_dir . '/' . $file;
            if (preg_match($pattern, ($args & PREG_FIND_FULLPATH) ? $filepath : $file))
            {
                $doadd = is_file($filepath)
                    || (is_dir($filepath) && ($args & PREG_FIND_DIRMATCH))
                    || (is_dir($filepath) && ($args & PREG_FIND_DIRONLY));
                if ($args & PREG_FIND_DIRONLY && $doadd && !is_dir($filepath)) $doadd = false;
                if ($args & PREG_FIND_NEGATE) $doadd = !$doadd;
                if ($doadd)
                {
                    if ($args & PREG_FIND_RETURNASSOC) // return more than just the filenames
                    {
                        $fileres = array();
                        if (function_exists('stat'))
                        {
                            $fileres['stat'] = stat($filepath);
                            $fileres['du'] = $fileres['stat']['blocks'] * 512; // disk usage in bytes
                        }
                        //if (function_exists('fileowner')) $fileres['uid'] = fileowner($filepath);
                        //if (function_exists('filegroup')) $fileres['gid'] = filegroup($filepath);
                        //if (function_exists('filetype')) $fileres['filetype'] = filetype($filepath);
                        //if (function_exists('mime_content_type')) $fileres['mimetype'] = mime_content_type($filepath);
                        if (function_exists('dirname')) $fileres['dirname'] = dirname($filepath);
                        if (function_exists('basename')) $fileres['basename'] = basename($filepath);
                        //if (isset($fileres['uid']) && function_exists('posix_getpwuid')) $fileres['owner'] = posix_getpwuid($fileres['uid']);
                        $files_matched[$filepath] = $fileres;
                    }
                    else array_push($files_matched, $filepath);
                }
            }
            // descend into subdirectories when asked to
            if (is_dir($filepath) && ($args & PREG_FIND_RECURSIVE)) $files_matched = array_merge($files_matched, preg_find($pattern, $filepath, $args));
        }
        closedir($fh);
    }
    return $files_matched;
}
Example usage:
$arr = preg_find('/./','z:\temp');
var_dump($arr);
Example output:
Another example:
$arr = preg_find('/\.tmp$/i','z:\temp',PREG_FIND_RECURSIVE | PREG_FIND_DIRMATCH);
var_dump($arr);
First of all, thanks to @IVO GELOV for the handy script you generously shared with me.
In case someone else needs this info:
After much searching, I found out that I only needed to use Tkinter to navigate through directories. It offers a dialog box with two methods: one to get the full path to a file and the other to a directory, which was what I needed. The path can be stored in a variable.
Thanks to all for the input.

PHP hidden directories - Windows

I'm attempting to add a feature to our intranet, which will allow users to log onto the intranet and access documents stored within a Windows network SAN.
At the moment, I've successfully retrieved all the file and folder names within a specified user's 'My Documents'.
I'm having difficulty removing hidden files and folders from the array.
At the moment, I can remove all folders and files starting with a '.'.
However, on Windows they're being marked as 'hidden' in the properties. I've googled and found lots of resources about how to mark a file as hidden, and how to hide files that start with a '.', but none on how to remove hidden Windows files / folders. One post on Stack Overflow mentions using DirectoryIterator, but doesn't explain at all how to use it to check if a file is marked as hidden.
We have over 1000 users, with approximately 500MB - 1GB of documents, with multiple layers of directories, so it needs to be relatively fast.
For clarification:
During a recursive iteration on a Windows system, how can I find out whether a directory is hidden or not, without relying on a prepended '.' symbol?
OK, so I worked it out, with help from the exec() function, so use with care!
I'm using CodeIgniter, so I've modified the directory_helper.php function slightly. As it's installed on a Windows box, it'll always need to check for the hidden files, but it should also work for non-CodeIgniter sites:
function directory_map($source_dir, $directory_depth = 0, $hidden = FALSE)
{
    if ($fp = @opendir($source_dir))
    {
        if ( ! $hidden)
        {
            // ask Windows for the names of the hidden entries (attribute H) in this directory;
            // escapeshellarg() keeps a path with quotes or spaces from breaking the command
            $exclude = array();
            exec('dir ' . escapeshellarg($source_dir) . ' /ah /B', $exclude);
        }
        $filedata = array();
        $new_depth = $directory_depth - 1;
        $source_dir = rtrim($source_dir, DIRECTORY_SEPARATOR).DIRECTORY_SEPARATOR;
        while (FALSE !== ($file = readdir($fp)))
        {
            // Remove '.', '..', and hidden files [optional]
            if ( ! trim($file, '.') OR ($hidden == FALSE && $file[0] == '.') OR ($hidden === FALSE && in_array($file, $exclude)))
            {
                continue;
            }
            if (($directory_depth < 1 OR $new_depth > 0) && @is_dir($source_dir.$file))
            {
                $filedata[$file] = directory_map($source_dir.$file.DIRECTORY_SEPARATOR, $new_depth, $hidden);
            }
            else
            {
                $filedata[] = $file;
            }
        }
        closedir($fp);
        return $filedata;
    }
    return FALSE;
}
This scanned 2207 files and 446 folders in approx. 11 seconds (ages, I know, but the best I could do). Tested it on 500 folders and 200 files, and it did it in around 3 seconds.
It's a recursive function which will scan each non-hidden directory. The first thing it does is scan the current directory for all hidden files and folders using exec('dir *directory* /ah /B').
It will then store the results in an array and make sure that the current file/directory being read isn't in that array.

php code snippet, return two arrays from one recursive function, my solution

Here's my recent result for recursive listing of a user directory.
I use the results to build a file manager (original screenshots).
Sorry, the 654321.jpg is uploaded several times to different folders; that's why it looks a bit messy.
Therefore I need two separate arrays, one for the directory tree, the other for the files.
Here I'm only showing the PHP solution, as I am currently still working on the JavaScript for usability. The array keys contain all currently needed info. The key "tree" is used to get an ID for the folders as well as a CLASS for the files (using jQuery to show files which are related to the active folder and hide those which are not), and so on.
The folder list is a UL/LI, the files section is a sortable table which includes a "show all files" function, where files are listed completely, sortable as well, with path info.
The function
function build_tree($dir, $deep = 0, $tree = '/', &$arr_folder = array(), &$arr_files = array()) {
    $dir = rtrim($dir, '/') . '/'; // not really necessary if 1st function call is clean
    $handle = opendir($dir);
    if ($handle === false) {
        return array($arr_folder, $arr_files); // unreadable directory: nothing to add
    }
    // compare against false so an entry named "0" does not end the loop early
    while (false !== ($file = readdir($handle)))
    {
        if ($file != "." && $file != "..")
        {
            if (is_dir($dir.$file))
            {
                $deep++;
                $tree_pre = $tree; // remember for reset
                $tree = $tree.$file.'/'; // builds something like "/","/sub1/","/sub1/sub2/"
                $arr_folder[$tree] = array('tree'=>$tree,'deep'=>$deep,'file'=>$file);
                build_tree($dir.$file,$deep,$tree,$arr_folder,$arr_files); // recursive function call
                $tree = $tree_pre; // reset to go to upper levels
                $deep--; // reset to go to upper levels
            }
            else
            {
                $arr_files[$file.'.'.$tree] = array('tree'=>$tree,'file'=>$file,'filesize'=>filesize($dir.$file),'filemtime'=>filemtime($dir.$file));
            }
        }
    }
    closedir($handle);
    return array($arr_folder,$arr_files); // cannot return two separate arrays, so return both in one
}
Calling the function
$build_tree = build_tree($udir); // 1st function call, $udir is my user directory
Get the arrays separated
$arr_folder = $build_tree[0]; // separate the two arrays
$arr_files = $build_tree[1]; // separate the two arrays
see results
print_r($arr_folder);
print_r($arr_files);
It works like a charm.
Whoever might need something like this, good luck with it.
I promise to post the entire code when finished :-)

how to scan all usages of a custom function in all my php files?

I have created my own l($text) function in PHP for a multilingual website. I use it like this in my documents:
echo '<h1>' . l('Title of the page') . '</h1>';
echo '<p>' . l('Some text here...') . '</p>';
My question is: with a PHP script, how can I scan all my .php files to catch all the usages of this function and list all the arguments used in a MySQL table?
The goal, of course, is to not forget any sentences in my translation files.
I didn't find anything on Google or here, so let me know if you have any ideas or need some more information.
Could you:
read all *.php files with glob(),
then use a regex to pull the strings out (preg_match()),
then insert the strings with a simple MySQL insert?
Seems simple enough?
I just finished; your help was useful! :-)
Here is my ugly code for those who might be interested. It's not beautifully coded, but it's not made to be loaded 10000 times per day, so...
<?php
// define a plain text document to see what happens on test
header('Content-Type: text/plain; charset=UTF-8');
$dossier = 'pages/'; // folder to scan
$array_exclude = array('.', '..', '.DS_Store'); // system files to exclude
$array_sentences_list = array();
if (is_dir($dossier)) // verify that it is a folder
{
    if ($dh = opendir($dossier)) // open folder
    {
        while (($file = readdir($dh)) !== false) // scan all files in the folder
        {
            // exclude system files previously listed in the array, and subfolders
            if (!in_array($file, $array_exclude) && is_file($dossier . $file))
            {
                echo "\n" . '######## ' . strtoupper($file) . ' ##########' . "\n";
                $file1 = file($dossier . $file); // read the current file line by line
                foreach ($file1 AS $fileline)
                {
                    // regex : not starting with a to z characters or a (
                    // then catch sentences inside l(' and ')
                    // and put the results in a $matchs array
                    preg_match_all("#[^a-z\(]l\('(.+)'\)#U", $fileline, $matchs);
                    // $matchs[0] holds the full matches, $matchs[1] the captured sentences
                    foreach ($matchs AS $match_this)
                    {
                        foreach ($match_this AS $line)
                        {
                            // technique of "I do not want to break my head":
                            // drop the full-match entries and keep only the captured sentences
                            if (substr($line, 0, 3) != "l('" AND substr($line, 0, 4) != " l('" AND substr($line, 0, 4) != ".l('")
                            {
                                // check that the sentence is not already listed
                                if (!in_array($line, $array_sentences_list))
                                {
                                    // if not, add it to the sentences list array and print it for fun!
                                    $array_sentences_list[] = $line;
                                    echo $line . "\n";
                                }
                            }
                        }
                    }
                }
            }
        }
        closedir($dh);
    }
}
?>
A small clarification: I do have to exclude various cases such as:
-> CSS : background: url('image.jpg');
and
-> jQuery : $(this).html('bla bla');
So that is why the regex starts with [^a-z(] :-)
It works very well now! I just have to finish later with recording entries in a MySQL table and ensuring that I can load the script from time to time when there are changes on the site... keep the existing translation, overwrite the existing files etc. No problem with that.
Thanks again, this website is really helpful! :-)
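The regex approach has to special-case exactly the CSS and jQuery strings listed above. An alternative, added here as an example (not the poster's code), uses PHP's tokenizer, which already knows what is code and what is string content; the function name l and the constructed sample source are assumptions.

```php
<?php
// Added example (not the poster's code): collect the literal argument of every
// l('...') call with PHP's tokenizer instead of a regex; inline HTML, comments
// and string contents are separate tokens, so CSS url('..') never looks like a call.
// Nearest non-whitespace, non-comment token before (step -1) or after (step 1) $i.
function step_code(array $tokens, int $i, int $step): ?int
{
    for ($j = $i + $step; $j >= 0 && $j < count($tokens); $j += $step) {
        $t = $tokens[$j];
        if (!is_array($t) || !in_array($t[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            return $j;
        }
    }
    return null;
}
// Unique strings passed as the single literal argument of global $funcName calls.
function extract_call_strings(string $source, string $funcName = 'l'): array
{
    $tokens = token_get_all($source);
    $found = [];
    foreach ($tokens as $i => $tok) {
        if (!is_array($tok) || $tok[0] !== T_STRING || $tok[1] !== $funcName) continue;
        // Skip $obj->l(...), Foo::l(...) and the definition "function l(...)".
        $p = step_code($tokens, $i, -1);
        if ($p !== null && is_array($tokens[$p])
            && in_array($tokens[$p][0], [T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_FUNCTION], true)) {
            continue;
        }
        // Require the exact shape "(" constant-string ")".
        $open = step_code($tokens, $i, 1);
        $arg = $open === null ? null : step_code($tokens, $open, 1);
        $close = $arg === null ? null : step_code($tokens, $arg, 1);
        if ($open === null || $tokens[$open] !== '(' || $close === null || $tokens[$close] !== ')'
            || !is_array($tokens[$arg]) || $tokens[$arg][0] !== T_CONSTANT_ENCAPSED_STRING) {
            continue;
        }
        $lit = $tokens[$arg][1];
        // Strip the quotes; undo the two single-quote escapes, or C-style escapes for double quotes.
        $text = $lit[0] === "'" ? str_replace(["\\'", "\\\\"], ["'", "\\"], substr($lit, 1, -1)) : stripcslashes(substr($lit, 1, -1));
        $found[$text] = true;
    }
    return array_keys($found);
}

// Constructed sample: the CSS url('..'), the jQuery .html('..') and the method call must be ignored.
$sample = "<?php\necho '<h1>' . l('Title of the page') . '</h1>';\n"
    . "echo \"<style>body { background: url('image.jpg'); }</style>\";\n"
    . "echo '<script>\$(this).html(\\'bla bla\\');</script>';\n"
    . "\$x = \$obj->l('no'); echo l('It\\'s here') . l('Title of the page');\n";
print_r(extract_call_strings($sample));
```

The same function can be run over every file returned by glob('pages/*.php') and the resulting list written to the translation table.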

PHP/Javascript/Jquery - Dynamic Website Diagram

I want to have an application that displays all of my website's external links and outputs a diagram. For example, www.example.com/articles/some-title.html is linked to my homepage.
Home
- www.example.com/some-text
- www.another-site.com/my-title
- www.example.com/articles/some-title.html Products
Products
- www.buy-now.com/product-reviews/231/098989
- www.sales.com/432/title-page.html Categories
- www.ezinearticles.com/blah-blah-blah
Something like SlickMap, but not on CSS.
I have set up a table in my DB, so this will be dynamic and more links are to come. I'm using CakePHP to work on this. Any ideas/suggestions?
Thanks for your time.
You can look at SlickMap; it is a CSS implementation for site diagrams:
http://astuteo.com/slickmap/
You can use PHP to retrieve the results from the database and you can use jQuery's treeView to display them.
Also, raphaël.js might be of interest, especially its diagram plugin; it's fully customizable and should be something to check out.
If I am understanding you correctly, you want to parse the contents of an entire web site (HTML, JS, etc.) and create an array that contains all of your links, as well as the pages that they can be found on. If that is correct, this code will get the job done:
<?php
$path = "./path_to_your_files/";
$result = array();
if ( $handle = opendir($path) ) {
    while (false !== ($file = readdir($handle))) {
        // skip the two pseudo-entries and anything that is not a regular file
        if ($file != "." && $file != ".." && is_file($path . $file)) {
            $contents = file_get_contents($path . $file);
            // capture the href value of every <a ...>...</a> element
            preg_match_all("/a[\s]+[^>]*?href[\s]?=[\s\"\']+"."(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/", $contents, $parts);
            foreach ( $parts[1] as $link ) {
                $result[$file][] = $link; // links grouped by the file they were found in
            }
        }
    }
    closedir($handle);
}
print_r($result);
?>
