MySQL/PHP Query - php

I have 3 groups of fields (each group consists of 2 fields) that I have to check against some condition. I don't check each field, but some combination, for example:
group priceEurBus, priceLocalBus
group priceEurAvio, priceLocalAvio
group priceEurSelf, priceLocalSelf
My example (formatted for legibility) — how can this be improved?
$rest .="
WHERE
(
((priceEurBus+(priceLocalBus / ".$ObrKursQuery.")) <= 400)
OR
((priceEurAvio+(priceLocalAvio / ".$ObrKursQuery.")) <= 400)
OR
((priceEurSelf+(priceLocalSelf / ".$ObrKursQuery.")) <= 400)
)
";
$ObrKursQuery is the value I use to convert local currency to Euro.

Performance improvement: Your query is OR-based, so MySQL can stop evaluating the conditions for a row as soon as it finds one of them to be true. Try to order your conditions so that the first check is the one most likely to be under 400. The optimizer does not guarantee left-to-right evaluation, though, so treat this as a hint rather than a guarantee.
Security improvement: Use prepared statements and validate your variables before using them. If $ObrKursQuery comes from user input or another untrusted source, it is an unquoted numeric value and you are exposed to a wide variety of SQL injection problems (including arithmetic SQL injection: if that value is 0, you get a division by zero that can be used as a blind SQL injection condition).
Readability improvement: Be consistent in the way you write your code and, if possible, follow an accepted de facto standard, like starting variable names in lower case: $ObrKursQuery -> $obrKursQuery. Also, for the sake of self-documenting code, choose variable names that say what they are: $ObrKursQuery -> $conversionRatio.
Maintainability/Scalability improvement: Use a constant instead of a fixed value for the 400. When you change that value in the future, you will want to change it in just one place and not all over your code.
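The four points above could be combined roughly as follows. This is only a sketch: buildPriceFilter and MAX_PRICE_EUR are made-up names, and it assumes a driver that accepts positional ? placeholders (PDO or mysqli).

```php
<?php
declare(strict_types=1);

// Hypothetical constant replacing the hard-coded 400.
const MAX_PRICE_EUR = 400.0;

/**
 * Builds the WHERE clause with placeholders instead of concatenated values.
 * Returns [sqlFragment, parameters] for use with a prepared statement.
 */
function buildPriceFilter(float $conversionRatio, float $maxPrice = MAX_PRICE_EUR): array
{
    // Reject zero or negative rates up front so the query never divides by zero.
    if ($conversionRatio <= 0) {
        throw new InvalidArgumentException('Conversion ratio must be positive.');
    }

    $conditions = [];
    $params = [];
    foreach (['Bus', 'Avio', 'Self'] as $group) {
        // Column names come from this fixed list, never from user input.
        $conditions[] = "(priceEur{$group} + priceLocal{$group} / ?) <= ?";
        $params[] = $conversionRatio;
        $params[] = $maxPrice;
    }

    return ['WHERE (' . implode(' OR ', $conditions) . ')', $params];
}

// 117.5 is an arbitrary example rate.
[$where, $params] = buildPriceFilter(117.5);
```

The fragment is appended to the rest of the SELECT, and $params is passed to the statement's execute call, so the rate and the limit never become part of the SQL text.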

Never use concatenation to generate your SQL; use prepared SQL statements with parameters instead.
The only way to simplify this statement without having greater knowledge of the problem domain is to reduce the number of columns. It looks as if you've got three prices per product entry. You could create a table of product prices instead of columns of product prices and this would make it a single comparison and give you the flexibility to create yet more product prices in the future.
So you'll need to create a one->many relationship between product and prices.

Related

PHP MySQL Negative Wildcard Character List

To secure an HTML GET method form submission so that users cannot use MySQL wildcards to get the whole table's values, I was looking for a list of PHP-MySQL wildcard characters.
For example, this GET method URL takes lowrange & highrange values as 1 and 100 respectively, and generates the appropriate results between that range: example.com/form.php?lowrange=1&highrange=100
But my table values may range from -10000 to +10000, & a smart alec may like to get the whole list by changing the URL as example.com/form.php?lowrange=%&highrange=% (or other special characters like *, ?, etc. etc.)
The basic purpose is to not allow anything that can lead to whole db values getting exposed in one shot.
So far, I've found the following characters to be avoided as in the preg_match:
if(preg_match('/^[~`!@#$%\^&\*\(\)\?\[\]\_]+$/',$url)) {
echo "OK";
}
else {
echo "NOT OK";
}
Any other characters to be included in the list to completely block the possibility of wildcard based querying?
There are string fields & number fields. String fields use LIKE matching (where field1 like '%GET-FORM-VALUE%'), & number fields use equality and BETWEEN matching (where field2 = $GET-FORM-VALUE, OR where field3 between $GET-FORM-LOVALUE and $GET-FORM-HIVALUE) in SQL.
Thank you.
No doubt that Prepared Statements are the best implementation, & MUST be the norm.
But sometimes, one gets into a "tricky scenario" where it may not be possible to implement it. For example, while working on a client project as external vendor, I was required to do similar implementation, but without having access to the code that made the connection (like, execute_query was not possible to implement, as connection to db was differently set in another config file). So I was forced to implement the "sanitization" of incoming form values.
To that, the only way was to check what data type & values were expected, & what wild characters can be used to exploit the submission.
If that is the case with you, then the alternate solution for your situation (String LIKE matching) & (numbers EQUAL TO or BETWEEN 2 given numbers) is as follows:
As soon as form is submitted, at backend first thing to do is:
Check that the string field contains only letters, which also BLOCKS the percentage sign & underscore.
if (preg_match('/^[A-Za-z]+$/', $strfield))
{
// all good...proceed to execute the query
}
else
{
// error message
}
Similarly, put a check for numbers/floats on number fields, like if (preg_match('/^-?[0-9]+(\.[0-9]+)?$/', $nofield))
Only if above are satisfied, then proceed to connect to database, and run the query. Add more checks on field to prevent other wild-cards, as needed.
Another option I implemented (may not necessarily fit, but mentioning as food for thought): In addition to the above checks, first generate a count of the records that fit the query. If the count is abnormally high, then either throw an error asking the user to narrow the range and resubmit, or display a limited number of records per page, making it cumbersome for them to keep clicking.
Again to reiterate, go for Prepared Statements if you can.
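If the values really must be checked by hand as described above, something along these lines could cover the two cases from the question. It is a sketch only: parseRangeValue and escapeLike are made-up helper names, and the -10000 to 10000 limits are taken from the question.

```php
<?php
declare(strict_types=1);

// Returns the value as an int if it is a plain integer within [$min, $max], else null.
function parseRangeValue(string $raw, int $min = -10000, int $max = 10000): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    return $value === false ? null : $value;
}

// Escapes LIKE wildcards so that % and _ in user input match literally.
// The backslash is replaced first so the escapes added afterwards stay intact.
function escapeLike(string $raw): string
{
    return str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], $raw);
}

$low = parseRangeValue('%');       // null, so the request can be rejected
$high = parseRangeValue('100');    // 100
$pattern = '%' . escapeLike('50%_off') . '%';
```

escapeLike only neutralises the wildcards (backslash is MySQL's default LIKE escape character); it does not make the value safe to concatenate into SQL, so it still has to be quoted or bound by whatever means the connection offers.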

Laravel multiple orderBy()

I have this code.
$array= $this->morphMany
('App\Models\Asset', 'assigned', 'assigned_type', 'assigned_to')
->withTrashed()
->orderByRaw('LENGTH(name)', 'ASC')->orderBy('name', 'ASC');
I am using it to perform a natural sort on strings with alphanumeric characters, as a plain alphabetic sort causes strange ordering, e.g.
product1
product10
product2
product20
It seems to be working flawlessly.
I have a few questions about this, mainly: what is the algorithm used in orderBy, and how does the combination of both calls end up giving me a natural order? I get that the combination of a length check and an alphabetic check is the solution, but how does this work in Laravel? Is there a specific sort algorithm used here, such as merge sort? I don't understand how it prioritizes one sort over the other.
I'm a total newbie to laravel. Thanks.
The problem is: item123 would come before item2 in the dictionary. To overcome this you're saying "Sort according to the dictionary only when the items have the same length otherwise shorter items come first". By that combination of rules you get:
item2 comes before item11 because it's shorter (ORDER BY LENGTH(name) takes priority)
item123 comes before item234 because it precedes it in the dictionary (Items have the same length so they are ordered by their value)
Which algorithm MySQL uses for sorting is not important here; it is enough to know that it is optimised for speed and for sorting huge data sets. What is important is that every sort algorithm uses a compare function to compare two values and determine their order.
MySQL constructs this function based on your ORDER BY statements and its own internal comparison rules. For example: ORDER BY LENGTH(name), name could result in a comparison as follows:
compare(x, y) {
    if (default_comparer(LENGTH(x.name), LENGTH(y.name)) == 0) {
        return default_comparer(x.name, y.name);
    } else {
        return default_comparer(LENGTH(x.name), LENGTH(y.name));
    }
}
where default_comparer would be a mock name of the default internal comparers that MySQL uses which (in the case of strings) would take a number of things into account like alphabetical order, locale, case rules etc. (In reality MySQL probably has a general comparer and then iterates through each order by statement to get the first non-zero result to return).
This is all a bit vague; I'm not a MySQL developer, so I can't provide more precise information, but this is a rough picture of how it works.
This has nothing to do with Laravel; your database decides how ordering is done.
Normally, when there are two or more ORDER statements, the results are first ordered by the first statement. If there are elements that have the same value for the first order statement, these are ordered by the second order statement and so on.
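The same "second key only breaks ties" rule can be tried out in plain PHP. This is just an illustration, not what MySQL runs internally: compareByLengthThenName is a made-up name, and strcmp compares raw bytes where MySQL would apply the column's collation.

```php
<?php
declare(strict_types=1);

// Mirrors ORDER BY LENGTH(name), name: the second key is consulted
// only when the first key reports a tie.
function compareByLengthThenName(string $a, string $b): int
{
    $byLength = strlen($a) <=> strlen($b);
    return $byLength !== 0 ? $byLength : strcmp($a, $b);
}

$names = ['product10', 'product2', 'product20', 'product1'];
usort($names, 'compareByLengthThenName');
// $names is now ['product1', 'product2', 'product10', 'product20']
```

Keep in mind that the length trick yields a natural order only when all names share the same prefix: 'b1' would sort before 'a10' simply because it is shorter.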

Setting Boost in Solr - Adding conditions

I would like to know if there is a way / method to set conditional boosts
Ex:
if( category:1 )( field_1^1.5 field_2^1.2 )else if( category:3 )( field_3^7.5 field_4^5.2 )
I'm planning to set the "qf" and "pf" parameters this way in order to boost my results, is it possible?
Conceptually, yes: it could be done using function queries (http://wiki.apache.org/solr/FunctionQuery), which include an if function, but I wasn't able to do it myself, since I couldn't use the == operator.
Also, you could write your own function query.
Right now, though, this is more of a good place to start than a concrete answer.
I think you have two ways of doing this...
The first way is to simplify things at index time: create another set of redundant fields in the schema (e.g. boostfield_1, boostfield_2, etc.), and if the document category is 1, set the value of boostfield_1 to field_1 and boostfield_2 to field_2. If the category is 2, set them to other fields.
This will allow you to use "pf" straight away without any conditions, as you already applied the conditions at index time and indexed the document differently based on the category. The problem is that you won't be able to change the score or boost values of the fields according to the category, but it is the simpler way.
The second way is to use the _val_ or bq parameters to specify a boost query, rewriting the same condition as follows:
url?q=query AND _val_:"(category:1 AND (field_1:query OR field_2:query)) OR (category:3 AND field_2:query)"
The little problem here as well is you repeat the query text in every inner query, which is not a big deal anyway.

How to find records in a database which differ only from one character to the search string?

I have database with a field 'clinicNo' and that field contains records like 1234A, 2343B, 9999Z ......
If by mistake I use '1234B' instead of '1234A' in the select statement, I want to get a result set containing the clinicNos that differ from the given string (i.e. 1234B above) by only one character.
Eg. Field may contain following values.
1234A, 1235B, 5433A, 4444S, 2978C
If I use '1235A' for the select query, it should give 1234A and 1235B as the result.
You could use SUBSTRING for your column selection; the example below returns every clinicNo that starts with '1235', whatever the final letter (note that SUBSTRING positions start at 1 in MySQL):
select * from TableName WHERE SUBSTRING(clinicNo, 1, 4) = '1235'
What you're looking for is called the Levenshtein Distance algorithm. While there is a levenshtein function in PHP, you really want to do this in MySQL.
There are two ways to implement a Levenshtein function in MySQL. The first is to create a stored function, which operates much like a stored procedure, except that it has distinct inputs and an output. This is fine for small datasets, but a little slow on anything approaching several thousand rows. You can find more info here: http://kristiannissen.wordpress.com/2010/07/08/mysql-levenshtein/
The second method is to implement a User Defined Function in C/C++ and link it into MySQL as a shared library (*.so file). This method also uses a STORED FUNCTION to call the library, which means the actual query for this or the first method may be identical (providing the inputs to both functions are the same). You can find out more about this method here: http://samjlevy.com/2011/03/mysql-levenshtein-and-damerau-levenshtein-udfs/
With either of these methods, your query would be something like:
SELECT clinicNo FROM words WHERE levenshtein(clinicNo, '1234A') < 2;
It's important to remember that the 'threshold' value should change in relation to the original word length. It's better to think of it in terms of a percentage value, i.e. half your word = 50%, half of 'term' = 2. In your case, you would probably be looking for a difference of < 2 (i.e. a 1 character difference), but you could go further to account for additional errors.
Also see: Wikipedia: Levenshtein Distance.
SELECT * FROM TableName
WHERE ClinicNo LIKE CONCAT(LEFT('1235A', 4), '%')
In general development, you could use a function like Levenshtein to find the difference between two strings; it returns a number telling you how far apart they are. You then probably want the results with the smallest distance.
To get Levenshtein also in MySQL, read this post.
Or just get all results and use the Levenshtein function of PHP.
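For the PHP route, a minimal sketch using the built-in levenshtein() could look like this. findNearMatches is a made-up name, and the list stands in for the clinicNo values fetched from the table.

```php
<?php
declare(strict_types=1);

// Keeps the candidates whose edit distance to $search is at most $maxDistance.
function findNearMatches(string $search, array $candidates, int $maxDistance = 1): array
{
    return array_values(array_filter(
        $candidates,
        fn (string $candidate): bool => levenshtein($search, $candidate) <= $maxDistance
    ));
}

$clinicNos = ['1234A', '1235B', '5433A', '4444S', '2978C'];
$matches = findNearMatches('1235A', $clinicNos);
// $matches is ['1234A', '1235B']
```

Levenshtein also counts insertions and deletions, so a value like '1235' would match as well; if only same-length substitutions should count, add a strlen check. Fetching every row only to filter it in PHP is reasonable for small tables only.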

How to compare data from database in user defined formulas

I have database table with fields:
pagerank
sites_in_google_index
conversion_rate
sites_in_google_index
adsense_revenue
yahoo_backlinks
google_backlinks
and about 20 more parameters, all of them integers collected every day
Now I need to let users define their own keys and visualize this data on graphs; for example, a user can define something like this:
NEW_KEY_1 = pagerank*google_backlinks-(conversion_rate+site_in_google_index)
and
NEW_KEY_2 = adsense_revenue*NEW_KEY_1
And now the user should be able to show these two data keys on a graph for a selected time period, e.g. 01-12-2009 to 02-03-2011
What is the best strategy to store and evaluate such data/keys?
The easiest way would be to let users define the SQL select expression and use it in the query to your database. However, never ever do this (!), since it is one of the biggest security holes you can dig!
The general idea should be to provide some kind of "custom query language" to your users. You then need a parser to validate its expressions and create an abstract syntax tree out of it. This syntax tree can then be transformed to a corresponding SQL query part.
In your specific case, you will basically end up parsing arithmetic expressions, in order to validate that the user provided field only contains such expressions and that they are valid, and serialize the corresponding tree back to SQL arithmetic expressions. Other benefits here are that you can ensure only valid field names are used, that you can limit the complexity of expressions and that you can add custom operators at a later stage.
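A very small version of that idea might look like the following sketch. All names are made up, and ALLOWED_FIELDS is an assumed whitelist containing only a few of the columns. It validates a formula with a recursive-descent parser and emits the SQL expression directly instead of building an explicit tree first.

```php
<?php
declare(strict_types=1);

// Hypothetical whitelist: only these columns may appear in a user formula.
const ALLOWED_FIELDS = ['pagerank', 'google_backlinks', 'conversion_rate', 'adsense_revenue'];

// Splits a formula into numbers, identifiers, operators and parentheses.
function tokenize(string $formula): array
{
    $stripped = preg_replace('/\s+/', '', $formula);
    preg_match_all('/\d+(?:\.\d+)?|[A-Za-z_][A-Za-z0-9_]*|[-+*\/()]/', $stripped, $m);
    // preg_match_all skips anything it cannot match, so compare against the input.
    if (implode('', $m[0]) !== $stripped) {
        throw new InvalidArgumentException('Formula contains unsupported characters.');
    }
    return $m[0];
}

// expression := term (('+' | '-') term)*
function parseExpression(array $tokens, int &$pos): string
{
    $sql = parseTerm($tokens, $pos);
    while (in_array($tokens[$pos] ?? null, ['+', '-'], true)) {
        $op = $tokens[$pos];
        $pos++;
        $sql .= " $op " . parseTerm($tokens, $pos);
    }
    return $sql;
}

// term := factor (('*' | '/') factor)*
function parseTerm(array $tokens, int &$pos): string
{
    $sql = parseFactor($tokens, $pos);
    while (in_array($tokens[$pos] ?? null, ['*', '/'], true)) {
        $op = $tokens[$pos];
        $pos++;
        $sql .= " $op " . parseFactor($tokens, $pos);
    }
    return $sql;
}

// factor := number | whitelisted field | '(' expression ')'
function parseFactor(array $tokens, int &$pos): string
{
    $token = $tokens[$pos] ?? '';
    $pos++;
    if ($token === '(') {
        $inner = parseExpression($tokens, $pos);
        $closing = $tokens[$pos] ?? '';
        $pos++;
        if ($closing !== ')') {
            throw new InvalidArgumentException('Missing closing parenthesis.');
        }
        return "($inner)";
    }
    if (is_numeric($token)) {
        return $token;
    }
    if (in_array($token, ALLOWED_FIELDS, true)) {
        return "`$token`";
    }
    throw new InvalidArgumentException("Unexpected token: '$token'");
}

// Validates a user formula and returns the equivalent SQL expression.
function formulaToSql(string $formula): string
{
    $tokens = tokenize($formula);
    $pos = 0;
    $sql = parseExpression($tokens, $pos);
    if ($pos !== count($tokens)) {
        throw new InvalidArgumentException('Unexpected trailing input.');
    }
    return $sql;
}

$sql = formulaToSql('pagerank*google_backlinks-(conversion_rate+adsense_revenue)');
```

Keys that refer to other keys (NEW_KEY_2 using NEW_KEY_1) are not handled here; one option would be to substitute the already validated expression of the referenced key in parentheses before parsing.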
