PHP prevent double clean url (improvements?) - php

For a client at work we have built a website. The website has an offering page which can contain variants of the same type/build, so they ran into problems with duplicate clean URLs.
Just now I wrote a function to prevent that from happening by appending a number to the URL. If that clean URL also exists, it counts up.
E.g.
domain.nl/product/machine
domain.nl/product/machine-1
domain.nl/product/machine-2
Updated! return $clean_url; on recursion and on return
The function I wrote works fine, but I was wondering if I have taken the right approach and if it maybe could be improved. Here's the code:
public function prevent_double_cleanurl($cleanurl)
{
// makes sure it doesnt check against itself
if($this->ID!=NULL) $and = " AND product_ID <> ".$this->ID;
$sql = "SELECT product_ID, titel_url FROM " . $this->_table . " WHERE titel_url='".$cleanurl."' " . $and. " LIMIT 1";
$result = $this->query($sql);
// if a matching url is found
if(!empty($result))
{
$url_parts = explode("-", $result[0]['titel_url']);
$last_part = end($url_parts);
// maximum of 2 digits
if((int)$last_part && strlen($last_part)<3)
{
// if a 1 or 2 digit number is found - add to it
array_pop($url_parts);
$cleanurl = implode("-", $url_parts);
(int)$last_part++;
}
else
{
// add a suffix starting at 1
$last_part='1';
}
// recursive check
$cleanurl = $this->prevent_double_cleanurl($cleanurl.'-'.$last_part);
}
return $cleanurl;
}

Depending on how likely it is that a clean URL gets used multiple times, your approach may not be the best one to go with. Say "foo" through "foo-10" already exist: you'd be querying the database about 10 times.
You also don't seem to sanitize the data you put into your SQL queries. Are you using mysql_real_escape_string (or its mysqli or PDO equivalent)?
Revised code:
public function prevent_double_cleanurl($cleanurl) {
$cleanurl_pattern = '#^(?<base>.*?)(-(?<num>\d+))?$#S';
if (preg_match($cleanurl_pattern, $cleanurl, $matches)) {
$base = $matches['base'];
$num = isset($matches['num']) ? (int) $matches['num'] : 0;
} else {
$base = $cleanurl;
$num = 0;
}
// makes sure it doesnt check against itself
if ($this->ID != null) {
$and = " AND product_ID <> " . $this->ID;
}
$sql = "SELECT product_ID, titel_url FROM " . $this->_table . " WHERE titel_url LIKE '" . $base . "-%'";
$result = $this->query($sql);
foreach ($result as $row) {
if ($this->ID && $row['product_ID'] == $this->ID) {
// the given cleanurl already has an ID,
// so we better not touch it
return $cleanurl;
}
if (preg_match($cleanurl_pattern, $row['titel_url'], $matches)) {
$_base = $matches['base'];
$_num = isset($matches['num']) ? (int) $matches['num'] : 0;
} else {
$_base = $row['titel_url'];
$_num = 0;
}
if ($base != $_base) {
// make sure we're not accidentally comparing "foo-123" and "foo-bar-123"
continue;
}
if ($_num > $num) {
$num = $_num;
}
}
// next free number
$num++;
return $base . '-' . $num;
}
I don't know about the possible values for your clean URLs. Last time I did something like this, my base could look like some-article-revision-5, with that 5 being part of the actual base, not the duplication index. To distinguish them (and allow the LIKE to filter out false positives) I made the clean URLs look like $base--$num. The double dash could only occur between the base and the duplication index, making things a bit simpler.
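The double-dash convention can be sketched as below. This is only an illustration: split_slug() is a hypothetical helper name, and it assumes that "--" never occurs inside the base itself.

```php
<?php
// Hypothetical helper (not part of the original class): splits "$base--$num"
// into its base and duplication index. Assumes "--" never occurs in the base.
function split_slug(string $slug): array
{
    // Greedy match up to the last "--" that is followed only by digits.
    if (preg_match('/^(.*)--(\d+)$/', $slug, $m)) {
        return [$m[1], (int) $m[2]];
    }
    // No duplication index present: the whole slug is the base.
    return [$slug, 0];
}

list($base, $num) = split_slug('some-article-revision-5--2');
echo $base . ' / ' . $num . "\n";
```

With this convention a trailing "-5" stays part of the base, because only a "--" followed by digits is treated as the duplication index.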

I have no way to test this, so that's on you, but here's how I'd do it. I put a ton of comments in there explaining my reasoning and the flow of the code.
Basically, the recursion is unnecessary and will result in more database queries than you need.
<?
public function prevent_double_cleanurl($cleanurl)
{
$sql = sprintf("SELECT product_ID, titel_url FROM %s WHERE titel_url LIKE '%s%%'",
$this->_table, $cleanurl);
if($this->ID != NULL){ $sql.= sprintf(" AND product_ID <> %d", $this->ID); }
$results = $this->query($sql);
$suffix = 0;
$baseurl = true;
foreach($results as $row)
{
// Consider the case when we get to the "first" row added to the db:
// For example: $row['titel_url'] == $cleanurl == 'domain.nl/product/machine'
if($row['titel_url'] == $cleanurl)
{
$baseurl = false; // The $cleanurl is already in the db, "this" is not a base URL
continue; // Continue with the next iteration of the foreach loop
}
// This could be done using regex, but if this works its fine.
// Make sure to test for the case when you have both of the following pages in your db:
//
// some-hyphenated-page
// some-hyphenated-page-name
//
// You don't want the counters to get mixed up
$url_parts = explode("-", $row['titel_url']);
$last_part = array_pop($url_parts);
$cleanrow = implode("-", $url_parts);
// To get into this block, three things need to be true
// 1. $last_part must be a numeric string (PHP Duck Typing bleh)
// 2. When represented as a string, $last_part must not be longer than 2 digits
// 3. The string passed to this function must match the string resulting from the (n-1)
// leading parts of the result of exploding the table row
if((is_numeric($last_part)) && (strlen($last_part)<=2) && ($cleanrow == $cleanurl))
{
$baseurl = false; // If there are records in the database, the
// passed $cleanurl isn't the first, so it
// will need a suffix
$suffix = max($suffix, (int)$last_part); // After this foreach loop is done, $suffix
// will contain the highest suffix in the
// database we'll need to add 1 to this to
// get the result url
}
}
// If $baseurl is still true, then we never got into the 3-condition block above, so we never
// found a matching record in the database -> return the cleanurl that was passed here, no need
// to add a suffix
if($baseurl)
{
return $cleanurl;
}
// At least one database record exists, so we need to add a suffix. The suffix we add will be
// the highest we found in the database plus 1.
else
{
return sprintf("%s-%d", $cleanurl, ($suffix + 1));
}
}
My solution takes advantage of SQL wildcards (%) to reduce the number of queries from n down to 1.
Make sure the problematic case I described in the code comments (some-hyphenated-page versus some-hyphenated-page-name) works as expected. Hyphens in the machine name (or whatever it is) could do unexpected things.
I also used sprintf to format the query. Make sure you sanitize any value that is inserted into the query as a string (e.g. $cleanurl).
As @rodneyrehm points out, PHP is very flexible with what it considers a numeric string. You might consider switching out is_numeric() for ctype_digit() and see how that works.
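To make the hyphen edge case and the ctype_digit() suggestion easier to test without a database, here is a minimal sketch of the same single-pass idea as a pure function. The name next_free_slug() is made up, and it assumes the titel_url values were already fetched into a plain array.

```php
<?php
// Hypothetical helper: given the requested slug and the slugs already stored,
// returns the slug itself if it is free, otherwise the next numbered variant.
function next_free_slug(string $base, array $existing): string
{
    $taken  = false;
    $max    = 0;
    $prefix = $base . '-';
    foreach ($existing as $slug) {
        if ($slug === $base) {
            $taken = true; // the unsuffixed slug is already in use
            continue;
        }
        if (strncmp($slug, $prefix, strlen($prefix)) !== 0) {
            continue; // a different slug altogether
        }
        $suffix = substr($slug, strlen($prefix));
        // ctype_digit() accepts only plain digits, so "name", "name-7"
        // or "1e3" are not mistaken for a counter.
        if (ctype_digit($suffix)) {
            $taken = true;
            $max   = max($max, (int) $suffix);
        }
    }
    return $taken ? $base . '-' . ($max + 1) : $base;
}

$existing = ['machine', 'machine-1', 'machine-2', 'machine-name', 'machine-name-7'];
echo next_free_slug('machine', $existing) . "\n";
```

Unlike the two-digit limit above, this sketch accepts counters of any length; whether that is desirable depends on what your slugs can look like.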

Related

How to use string variable that have special character "#" in select query statement mysql php

<?php
include('dbLink2.php');
$quizqr = $_GET['quizQR'];
$recordsID1 = $_GET['recordsID1'];
$recordsID2 = $_GET['recordsID2'];
$m_array1=array();
$m_array=array();
$sql = "SELECT quizQR, recordsID FROM `registertestactivity` WHERE (quizQR = '$quizqr' OR recordsID = '$recordsID1' OR recordsID = '$recordsID2') LIMIT 1";
$result = @mysqli_query($link, $sql) or die();
if (@mysqli_affected_rows($link) > 0) {
while($row = @mysqli_fetch_assoc($result))
{
$m_array[]=$row;
}
} else {
$m_array1 += ["quizQR" => "NoRecords"];
$m_array1 += ["recordsID" => "NoRecords"];
$m_array[0] = $m_array1;
}
echo json_encode($m_array);
@mysqli_free_result($result);
@mysqli_close($link);
?>
Can someone help me out? I have tried mysqli_real_escape_string and it still doesn't work :(
The $quizqr value has a '#' character in the string, and this is the error message that pops up when the AJAX call hits this PHP script:
Because you have a # in the URL you're dealing with a URL Fragment which means that everything past the # is not available in the query string. PHP offers a flag, PHP_URL_FRAGMENT for its parse_url() function which can help you get what you need from the string.
Here is one example using the URL you provided:
$fragment = parse_url($url, PHP_URL_FRAGMENT);
echo $fragment;
$fragmentSection = explode('&', $fragment);
print_r($fragmentSection);
foreach($fragmentSection AS $section) {
if(0 != strpos($section, '=')) {
$sectionParts = explode('=', $section);
$queryParts[$sectionParts[0]] = $sectionParts[1];
}
}
print_r($queryParts);
This ultimately returns two array members which could then be used in your query:
Array
(
[recordsID1] => records_001
[recordsID2] => records_002
)
The best thing to do would be to write a function to which you pass the URL to return the elements you need.
Keep in mind that this is not fool-proof. If the URL is in a different format then what I have done here will have to be modified to work as you would like it to.
Additionally you have been given some warnings and guidance in the comments you should follow to keep your code safe and efficient, so I will not repeat them here.
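As a rough sketch of such a function, the built-in parse_str() can do the splitting. The helper name fragment_params() and the example URL are made up for illustration. The last line shows the other side of the problem: if the value is URL-encoded before the request is built, the '#' travels as %23 and stays in the query string.

```php
<?php
// Hypothetical helper: returns the key/value pairs that ended up in the
// fragment (everything after the first '#') of a URL.
function fragment_params(string $url): array
{
    $fragment = parse_url($url, PHP_URL_FRAGMENT);
    if (!is_string($fragment)) {
        return []; // no fragment, or the URL could not be parsed
    }
    parse_str($fragment, $params);
    return $params;
}

// Made-up URL: the '#' inside the quizQR value starts the fragment.
$url = 'http://example.com/check.php?quizQR=QR#42&recordsID1=records_001&recordsID2=records_002';
print_r(fragment_params($url));

// Encoding the value first keeps it inside the query string.
$safeQuery = http_build_query(['quizQR' => 'QR#42']);
echo $safeQuery . "\n";
```

On the JavaScript side the equivalent of the last step would be encodeURIComponent() on the value before it is appended to the URL.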

Distinct values from while loop

I have a database with a field called part_name, with the values:
Front Control Arm, Rear Control Arm
I need to echo out Control Arm only once.
As of now, what I'm getting from the while loop is
Control Arm, Control Arm.
I need to echo out distinct values from the while loop results. I can't do it with SELECT DISTINCT in the SQL query, because I'm running preg_replace on the value from the row that I'm querying from the database.
while ($rowsparts = mysql_fetch_array($displayparts)) {
$part=''.$rowsparts['part_name'].'';
$part = preg_replace('/\bFront\b/u', '', $part);
$part = preg_replace('/\bRear\b/u', '', $part);
echo '<li>'.$part.'</li>';
}
I need to echo out the $part variable distinct values, only once per part name.
Control Arm only.
This might work if you want to only display the part name once:
$lastpart = '';
while ($rowsparts = mysql_fetch_array($displayparts)) {
$part = $rowsparts['part_name'];
$part = trim(str_replace('Rear','',str_replace('Front','',$part)));
if($lastpart != $part) {
$lastpart = $part;
echo "<li>".$part."</li>\n";
} else {
echo "<li>Part name duplicate: $part, lastpart: $lastpart</li>\n";
}
}
I added the else for debugging. It will show the two vars that are used to detect duplicates. You would remove it after testing.
This might work if you want to only display the part name once when the part names are not in order.
It builds a list of part names, and checks each part names against the list, showing
only the ones not found in the list.
$part_list = array();
while ($rowsparts = mysql_fetch_array($displayparts)) {
$part = $rowsparts['part_name'];
$part = trim(str_replace('Rear','',str_replace('Front','',$part)));
if(!isset($part_list[$part])) {
$part_list[$part] = 1;
echo "<li>".$part."</li>\n";
}
}
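The same idea can also be written with array_unique() once the rows are collected. This is a small sketch with a hypothetical helper name, distinct_part_names(), and a hard-coded array standing in for the fetched part_name values.

```php
<?php
// Hypothetical helper: strips the position words and returns each
// remaining part name once, in order of first appearance.
function distinct_part_names(array $names): array
{
    $clean = [];
    foreach ($names as $name) {
        $clean[] = trim(preg_replace('/\b(Front|Rear)\b/u', '', $name));
    }
    // array_unique() keeps the first occurrence; array_values() reindexes.
    return array_values(array_unique($clean));
}

// Stand-in for the part_name values fetched from the database.
$rows = ['Front Control Arm', 'Rear Control Arm', 'Front Strut'];
foreach (distinct_part_names($rows) as $part) {
    echo '<li>' . htmlspecialchars($part) . "</li>\n";
}
```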

Loop through an array to create an SQL Query

I have an array like the following:
tod_house
tod_bung
tod_flat
tod_barnc
tod_farm
tod_small
tod_build
tod_devland
tod_farmland
If any of these has a value, I want to add it to an SQL query; if it doesn't, I ignore it.
Further, if one has a value it needs to be added as an AND, and any subsequent ones need to be an OR (but there is no way of telling which is going to be the first to have a value!)
I've used the following snippet to check the first value and append to the query as needed, but I don't want to copy and paste this 9 times, once for each of the items in the array.
$i = 0;
if (isset($_GET['tod_house'])){
if ($i == 0){
$i=1;
$query .= " AND ";
} else {
$query .= " OR ";
}
$query .= "tod_house = 1";
}
Is there a way to loop through the array changing the names so I only have to use this code once (please note that $_GET['tod_house'] on the first line and tod_house on the last line are not the same thing! - the first is the name of the checkbox that passes the value, and the second one is just a string to add to the query)
Solution
The answer is based heavily upon the accepted answer, but I will show exactly what worked in case anyone else stumbles across this question.
I didn't want the result to be as suggested:
tod_bung = 1 AND (tod_barnc = 1 OR tod_small = 1)
rather I wanted it like:
AND (tod_bung = 1 OR tod_barnc = 1 OR tod_small = 1)
so it could be appended to an existing query. Therefore his answer has been altered to the following:
$qOR = array();
foreach ($list as $var) {
if (isset($_GET[$var])) {
$qOR[] = "$var = 1";
}
}
$qOR = implode(' OR ', $qOR);
$query .= " AND (" .$qOR . ")";
That is, there is no need for two different arrays: just loop through as he suggests, add each set value to the new qOR array, then implode with OR, surround with parentheses, and append to the original query.
The only slight issue with this is that if only one item is set, the query looks like:
AND (tod_bung = 1)
There are parentheses but no OR inside. Strictly speaking they aren't needed, but they don't alter how the query works, so no worries!
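One case the snippet above does not cover is when none of the checkboxes is set, which would append "AND ()" to the query. A minimal sketch that guards against this is shown below. The function name build_tod_clause() is made up, and passing $_GET in as a parameter is just a choice to keep it testable.

```php
<?php
// Hypothetical helper: builds " AND (a = 1 OR b = 1)" from the checkbox
// names that are present, or an empty string if none is set.
function build_tod_clause(array $params, array $allowed): string
{
    $conditions = [];
    // Iterate over the whitelist, never over the request keys, so that
    // only known column names can end up in the SQL.
    foreach ($allowed as $column) {
        if (isset($params[$column])) {
            $conditions[] = "$column = 1";
        }
    }
    if ($conditions === []) {
        return ''; // nothing selected: leave the query untouched
    }
    return ' AND (' . implode(' OR ', $conditions) . ')';
}

$list = ['tod_house', 'tod_bung', 'tod_flat', 'tod_barnc', 'tod_farm',
         'tod_small', 'tod_build', 'tod_devland', 'tod_farmland'];
// Stand-in for $_GET.
echo build_tod_clause(['tod_bung' => '1', 'tod_small' => '1'], $list) . "\n";
```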
$list = array('tod_house', 'tod_bung', 'tod_flat', 'tod_barnc', 'tod_farm', 'tod_small', 'tod_build', 'tod_devland', 'tod_farmland');
$qOR = array();
$qAND = array();
foreach ($list as $var) {
if (isset($_GET[$var])) {
if (!empty($qAND)) {
$qOR[] = "$var = 1";
} else {
$qAND[] = "$var = 1";
}
$values[] = $_GET[$var];
}
}
$qOR = implode(' OR ', $qOR);
// only add the OR group when it is not empty,
// otherwise the result would end in a dangling " AND "
if ($qOR != '') {
$qAND[] = '(' . $qOR . ')';
}
$qAND = implode(' AND ', $qAND);
echo $qAND;
This will output something like tod_bung = 1 AND (tod_barnc = 1 OR tod_small = 1)
As the parameter passed to $_GET is a string, you should build an array of strings containing all the keys above, iterating it and passing the values like if (isset($_GET[$key])) { ...
You could then even take the key for appending to the SQL string.
There are a lot of ways out there:
$list = array('tod_house', 'tod_bung', 'tod_flat', 'tod_barnc', 'tod_farm', 'tod_small', 'tod_build', 'tod_devland', 'tod_farmland');
if($_GET){
$query = "";
foreach ($_GET as $key=>$value){
if(in_array($key,$list) && $value){
$query .= (! $query) ? " AND ":" OR ";
$query .= $key." = '".$value."'";
}
}
}
Of course, you still have to take care of XSS and SQL injection.
If the array elements are tested on the same column, you should use IN (...) rather than:
AND ( ... OR ... OR ... )
If the values are 1 or 0 this should do it :
// If you need to get the values.
$values = $_GET;
$tod = array();
foreach($values as $key => $value) {
// if you only want the ones with a key like 'tod_'
// otherwise remove if statement
if(strpos($key, 'tod_') !== FALSE) {
$tod[$key] = $value;
}
}
// If you already have the values.
$tod = array(
'tod_house' => 1,
'tod_bung' => 0,
'tod_flat' => 1,
'tod_barnc' => 0
);
// remove all array elements with a value of 0
// (array_filter without a callback drops every falsy value).
$tod = array_filter($tod);
// discard values (only keep keys).
$tod = array_keys($tod);
// build query which returns : AND column IN ('tod_house','tod_flat')
$query = "AND column IN ('" . implode("','", $tod) . "')";

Give another random int if number exists in database (PHP)

I am trying to make a script to check if an int is already added to my database. If so, it will re-generate another random number and check again. If it doesn't exist, it'll insert into the database.
However, I am having troubles. If a number exists, it just prints out num exists, how would I re-loop it to check for another and then insert that? I have tried to use continue;, return true; and so on... Anyway, here is my code; hopefully someone can help me!
<?php
require_once("./inc/config.php");
$mynum = 1; // Note I am purposely setting this to one, so it will always turn true so the do {} while will be initiated.
echo "attempts: ---- ";
$check = $db->query("SELECT * FROM test WHERE num = $mynum")or die($db->error);
if($check->num_rows >= 1) {
do {
$newnum = rand(1, 5);
$newcheck = $db->query("SELECT * FROM test WHERE num = $newnum")or die($db->error);
if($newcheck->num_rows >= 1) {
echo $newnum . " exists! \n";
} else {
$db->query("INSERT test (num) VALUES ('$newnum')")or die($db->error);
echo "$newnum - CAN INSERT#!#!#";
break;
}
} while(0);
}
?>
I think the logic you're looking for is basically this:
do {
$i = get_random_int();
} while(int_exists($i));
insert_into_db($i);
(It often helps to come up with some functions names to simplify things and understand what's really going on.)
Now just replace the pseudo functions with your code:
do {
$i = rand(1, 5);
$newcheck = $db->query("SELECT * FROM test WHERE num = $i")or die($db->error);
if ($newcheck->num_rows >= 1) {
$int_exists = true;
} else {
$int_exists = false;
}
} while($int_exists);
$db->query("INSERT test (num) VALUES ('$i')") or die($db->error);
Of course, you can do a little more tweaking, by shortening...
// ...
if ($newcheck->num_rows >= 1) {
$int_exists = true;
} else {
$int_exists = false;
}
} while($int_exists);
...to:
// ...
$int_exists = $newcheck->num_rows >= 1;
} while($int_exists);
(The result of the >= comparison is boolean, and as you can see, you can assign this value to a variable, too, which saves you 4 lines of code.)
Also, if you want to get further ahead, try to replace your database calls with actual, meaningful functions as I did in my first example.
This way, your code will become more readable, compact and reusable. And most important of all, this way you learn more about programming.
The logic is incorrect here. Your do-while loop will get executed only once (as it's an exit-controlled loop) and will stop on the next iteration as the while(0) condition is FALSE.
Try the following instead:
while($check->num_rows >= 1) {
$newnum = rand(1, 5);
$newcheck = $db->query("SELECT * FROM test WHERE num = $newnum")or die($db->error);
if ($newcheck->num_rows >= 1) {
echo $newnum . " exists! \n";
} else {
$db->query("INSERT test (num) VALUES ('$newnum')") or die($db->error);
echo "$newnum - CAN INSERT#!#!#";
break;
}
}
Sidenote: As it currently stands, your query is vulnerable to SQL injection and could produce unexpected results. You should always escape user inputs. Have a look at this StackOverflow thread to learn how to prevent SQL injection.
Here is an example of some code that I threw together using some of my previously made scripts. You will notice a few changes compared to your code, but the concept should work just the same. Hope it helps.
In my example I would be pulling the database HOST,USER,PASSWORD and NAME from my included config file
require_once("./inc/config.php");
echo "attempts: ---- ";
$running = true;
while($running == true) {
//create random number from 1-5
$newnum = rand(1,5);
//connect to database
$mysqli = new mysqli(HOST, USER, PASSWORD, NAME);
//define our query
$sql = "SELECT * FROM `test` WHERE `num` = '".$newnum."'";
//run our query
$check_res = mysqli_query($mysqli, $sql) or die(mysqli_error($mysqli));
//check results: if num_rows >= 1, our number exists
if (mysqli_num_rows($check_res) >= 1){
echo $newnum . " exists! \n";
}
else { //our number does not yet exists in database
$sql = "INSERT INTO `test`(`num`) VALUES ('".$newnum."')";
$check_res = mysqli_query($mysqli, $sql) or die(mysqli_error($mysqli));
if ($check_res){
echo $newnum . " - CAN INSERT#!#!#";
// close connection to database
mysqli_close($mysqli);
}
else{
echo "failed to enter into database";
// close connection to database
mysqli_close($mysqli);
}
break;
}
}
I would also like to note that this will keep running forever once all the numbers have been used. You may want to add something that tracks when all numbers have been used and breaks out of the loop.
Hope this helps!
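One way to avoid both the repeated queries and the endless loop is to pick only from the numbers that are still free. The sketch below is not the code from any of the answers: pick_unused() is a hypothetical name, and it assumes the used numbers were fetched once (for example with a single SELECT num FROM test) into an array.

```php
<?php
// Hypothetical helper: returns a random number from $min..$max that is
// not in $used, or null when every number in the range is taken.
function pick_unused(array $used, int $min, int $max): ?int
{
    // Numbers in the range that do not appear in $used.
    $free = array_values(array_diff(range($min, $max), $used));
    if ($free === []) {
        return null; // range exhausted, nothing left to insert
    }
    return $free[random_int(0, count($free) - 1)];
}

// Stand-in for the values already stored in the table.
$used = [1, 2, 4];
$num  = pick_unused($used, 1, 5);
echo ($num === null) ? "all numbers used\n" : "can insert $num\n";
```

If several requests can run at the same time, a UNIQUE index on the num column is still worth having, since two requests could pick the same free number.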

Indexing text files in PHP

I have been set a challenge to create an indexer that takes all words 4 characters or more, and stores them in a database along with how many times the word was used.
I have to run this indexer on 4,000 txt files. Currently, it takes about 12-15 minutes - and I'm wondering if anyone has a suggestion for speeding things up?
Currently I'm placing the words in an array as follows:
// ==============================================================
// === Create an index of all the words in the document
// ==============================================================
function index(){
$this->index = Array();
$this->index_frequency = Array();
$this->original_file = str_replace("\r", " ", $this->original_file);
$this->index = explode(" ", $this->original_file);
// Build new frequency array
foreach($this->index as $key=>$value){
// remove everything except letters
$value = clean_string($value);
if($value == '' || strlen($value) < MIN_CHARS){
continue;
}
if(array_key_exists($value, $this->index_frequency)){
$this->index_frequency[$value] = $this->index_frequency[$value] + 1;
} else{
$this->index_frequency[$value] = 1;
}
}
return $this->index_frequency;
}
I think the biggest bottleneck at the moment is the script that stores the words in the database. It needs to add the document to the essays table and then, if the word exists in the table, just append essayid(frequency of the word) to the field; if the word doesn't exist, it adds it...
// ==============================================================
// === Store the word frequencies in the db
// ==============================================================
private function store(){
$index = $this->index();
mysql_query("INSERT INTO essays (checksum, title, total_words) VALUES ('{$this->checksum}', '{$this->original_filename}', '{$this->get_total_words()}')") or die(mysql_error());
$essay_id = mysql_insert_id();
foreach($this->index_frequency as $key=>$value){
$check_word = mysql_result(mysql_query("SELECT COUNT(word) FROM `index` WHERE word = '$key' LIMIT 1"), 0);
$eid_frequency = $essay_id . "(" . $value . ")";
if($check_word == 0){
$save = mysql_query("INSERT INTO `index` (word, essays) VALUES ('$key', '$eid_frequency')");
} else {
$eid_frequency = "," . $eid_frequency;
$save = mysql_query("UPDATE `index` SET essays = CONCAT(essays, '$eid_frequency') WHERE word = '$key' LIMIT 1");
}
}
}
You might consider profiling your app to know exactly where are your bottlenecks. This might give you a better understanding of what can be improved.
Regarding DB optimisation: check if you have an index on word column, then try lowering the number of times you access DB. INSERT ... ON DUPLICATE KEY UPDATE ..., maybe?
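On the PHP side, the counting loop can also be replaced by built-in functions, which is worth measuring when profiling. This is only a sketch: word_frequencies() is a made-up name, and it treats any run of ASCII letters as a word, which is not necessarily what your clean_string() does (for example, "don't" would be split into "don" and "t" here).

```php
<?php
// Hypothetical helper: counts how often each word of at least $minChars
// letters occurs in $text. Case is left untouched.
function word_frequencies(string $text, int $minChars = 4): array
{
    // Collect every run of ASCII letters.
    preg_match_all('/[A-Za-z]+/', $text, $matches);
    // Drop the words that are too short.
    $words = array_filter($matches[0], function ($word) use ($minChars) {
        return strlen($word) >= $minChars;
    });
    // array_count_values() builds the word => count map in one call.
    return array_count_values($words);
}

print_r(word_frequencies('The quick brown quick fox; quick!'));
```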
