Someone in our group retired, and I'm trying to figure out what his MERGE statement (and the associated code) does so I can determine how to convert some (not all) values to integers before sending them up. I'm an absolute newbie with Microsoft SQL; I took a PHP class a few years ago, but I don't have much experience. I've tried googling the MERGE command, but I'm having trouble with a couple of parts of it. See my questions in the comments below (marked // ?).
I've looked at:
http://php.net/manual/en/pdo.query.php
http://stackoverflow.com/questions/4336573/merge-to-target-columns-using-source-rows
http://pic.dhe.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=%2Fsqlp%2Frbafymerge.htm
I realize these are basic questions but I'm trying to figure it out and nobody around here knows.
function storeData ($form)
{
    global $ms_conn, $QEDnamespace;
    //I'm not sure what this is doing?? I thought this was where it was sending data up??
    $qry = "MERGE INTO visEData AS Target
            USING (VALUES (?,?,?,?,?,?,?,?,?,?))
              AS Source (TestGUID, pqID, TestUnitID, TestUnitCountID,
                         ColorID, MeasurementID, ParameterValue,
                         Comments, EvaluatorID, EvaluationDate)
            ON Target.pqID = Source.pqID
              AND Target.MeasurementID = Source.MeasurementID //what is this doing?
              AND Target.ColorID = Source.ColorID //what is target and source?
            WHEN MATCHED THEN
              UPDATE SET ParameterValue = Source.ParameterValue,
                         EvaluatorID = Source.EvaluatorID, //where is evaluatorID and source? My table or the table we're sending it to?
                         EvaluationDate = Source.EvaluationDate,
                         Comments = Source.Comments
            WHEN NOT MATCHED BY TARGET THEN
              INSERT (TestGUID,
                      pqID, TestUnitID, TestUnitCountID,
                      ColorID, MeasurementID,
                      ParameterValue, Comments,
                      EvaluatorID, EvaluationDate, TestIndex, TestNumber)
              VALUES (Source.TestGUID, Source.pqID,
                      Source.TestUnitID,
                      Source.TestUnitCountID,
                      Source.ColorID, Source.MeasurementID, Source.ParameterValue,
                      Source.Comments, Source.EvaluatorID, Source.EvaluationDate, ?, ?);";

    $pqID      = coverSheetData($form);
    $tid       = getBaseTest($form['TextField6']);
    $testGUID  = getTestGUID($tid);
    $testIndex = getTestIndex($testGUID);

    foreach ($form['visE']['parameters'] as $parameter => $element)
    {
        foreach ($element as $key => $data)
        {
            if ( mb_ereg_match('.+evaluation', $key) === true )
            {
                $testUnitData = getTestUnitData($form, $key, $tid, $testGUID);
                try
                {
                    //I'm not sure if this is where it's sent up??
                    //Maybe I could add the integer conversion here??
                    $ms_conn->query ($qry, array(
                        $testGUID, $pqID,
                        $testUnitData[0], $testUnitData[1], $testUnitData[2],
                        $element['parameterID'], $data, $element['comments'],
                        $QEDnamespace->userid, date ('Y-m-d'), $testIndex, $tid));
                }
                catch (Zend_Db_Statement_Sqlsrv_Exception $e)
                {
                    dataLog($e->getMessage());
                    returnStatus ("Failed at: " . $key);
                }
            }
        }
    }
}
This is a bit long for a comment. If you are using SQL Server, then look at the SQL Server documentation on MERGE. All the SQL Server documentation is online, and it is very easy to find via Google (and perhaps even easier using Bing).
The purpose of the MERGE command is to do both inserts and updates in one step. Basically, you have a table that has new data ("source") and a table to be updated ("target"). When a record matches, the existing record in the target is updated with the matching record in the source. When a record doesn't match, it is inserted into the target.
The main advantage of MERGE over two statements is not necessarily the elegant and intuitively obvious syntax. The main advantage is that all the operations occur in a single transaction, so either they all succeed or all fail as one.
The syntax actually isn't that bad. I would recommend that you set up a test database and try a few examples on your own, so you at least understand the syntax. Then return to this code. When you do, print out the resulting MERGE statement and paste it into SQL Server Management Studio, where you will get nicely color-coded keywords. Then go through it step by step, and you'll probably find that it makes a lot of sense.
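For example, here is a minimal MERGE sketch against a made-up Employees table (the table and column names are purely illustrative): rows from Source that match on id update the existing row, and rows with no match get inserted.

MERGE INTO Employees AS Target
USING (VALUES (1, 'Alice'), (2, 'Bob')) AS Source (id, name)
    ON Target.id = Source.id
WHEN MATCHED THEN
    UPDATE SET name = Source.name
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, name) VALUES (Source.id, Source.name);

The statement in your code is the same pattern, just with more columns and with ? placeholders that the PHP call fills in at run time.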
I'm working on a PHP + MySQL application that will crawl an HDD/shared drive and index all files and directories into a database, to provide a "fulltext" search over them. So far I'm doing well, but I'm stuck on the question of whether I chose a good way to store the data in the database.
In the picture below you can see part of my database schema. The idea is that I save a domain (which represents the part of the disk I want to index), then there are links (which represent files and folders, with content, filepath, etc.), and then I have a table to store the unique keywords I find in file/folder names or content.
And finally, I have 16 linkkeyword tables to store the relations between links and keywords. I have 16 of them because I thought it might be good to build something like a hash table, since I'm expecting a high number of link <-> keyword relations (so far, for 15k links and 400k keywords, I have about 2.5 million linkkeyword records). To avoid storing that much data in one table (and later searching over it), I thought this hash-table approach could be faster. It works like this: when I want to search for a word, I compute its md5, look at the first character of the md5, and then I know which linkkeyword table to use. So there are only about 150-200k records in each linkkeyword table (against 2.5 million).
So I'm curious whether this approach is of any use, or whether it would be better to store all the linkkeyword information in a single table and let MySQL take care of it (and up to how many link <-> keyword relations that would still work).
So far this has been a great solution for me, but I hit a wall when I tried to implement regular-expression search, so the user can type e.g. "tem*", which should match temp, temporary, temple, etc. In the normal case, when searching for a word, I compute its md5 hash and then I know which linkkeyword table to look in. But for a regular expression I need to get all keywords from the keyword table (that match the regular expression) and then process them one by one.
I'm also attaching part of the code for the normal keyword search:
private function searchKeywords($selectedDomains) {
    $searchValues = $this->searchValue;
    $this->resultData = array();
    foreach (explode(" ", $searchValues) as $keywordName) {
        $keywordName = strtolower($keywordName);
        $keywordMd5 = md5($keywordName);
        $selection = $this->database->table('link');
        $results = $selection->where('domain.id', $selectedDomains)
            ->where('domain.searchable = ?', '1')
            ->where(':linkkeyword' . $keywordMd5[0] . '.keyword.keyword LIKE ?', $keywordName)
            ->select('link.*,:linkkeyword' . $keywordMd5[0] . '.weight,:linkkeyword' . $keywordMd5[0] . '.keyword.keyword');
        foreach ($results as $result) {
            $keyExists = array_key_exists($result->linkId, $this->resultData);
            if ($keyExists) {
                $this->resultData[$result->linkId]->updateWeight($result->weight);
                $this->resultData[$result->linkId]->addKeyword($result->keyword);
            } else {
                $domain = $result->ref('domain');
                $linkClass = new search\linkClass($result, $domain);
                $linkClass->updateWeight($result->weight);
                $linkClass->addKeyword($result->keyword);
                $this->resultData[$result->linkId] = $linkClass;
            }
        }
    }
}
and the regular-expression search function:
private function searchRegexp($selectedDomains) {
    //get the stored search value
    $searchValues = $this->searchValue;
    //replace asterisk and exclamation mark (the user's wildcard characters) with their MySQL LIKE equivalents
    $searchValues = str_replace("*", "%", $searchValues);
    $searchValues = str_replace("!", "_", $searchValues);
    //empty result array to prevent previous results from interfering
    $this->resultData = array();
    //the searched phrase can be multiple keywords, so split it by spaces and get results for each keyword
    foreach (explode(" ", $searchValues) as $keywordName) {
        //set default link result weight to -1 (default value)
        $weight = -1;
        //select all keywords which match the searched keyword (or its pattern)
        $keywords = $this->database->table('keyword')->where('keyword LIKE ?', $keywordName);
        foreach ($keywords as $keyword) {
            //compute the keyword's md5 sum to determine which table should be used to match its links
            $md5 = md5($keyword->keyword);
            //get all link ids from the linkkeyword relation table
            $keywordJoinLink = $keyword->related('linkkeyword' . $md5[0])->where('link.domain.searchable', '1');
            //loop over the found links
            foreach ($keywordJoinLink as $link) {
                //store link weight, for sorting the results later
                $weight = $link->weight;
                //get link ID
                $linkId = $link->linkId;
                //check if the link already exists in the results, to prevent duplicates
                $keyExists = array_key_exists($linkId, $this->resultData);
                if ($keyExists) {
                    //if the link is already in the result set, just update its weight and record the matching keyword for later keyword tagging
                    $this->resultData[$linkId]->updateWeight($weight);
                    $this->resultData[$linkId]->addKeyword($keyword->keyword);
                } else {
                    //if the link isn't in the results yet, insert it
                    //get link reference
                    $linkData = $link->ref('link', 'linkId');
                    //get information about the domain the link belongs to (location, flagPath, ...)
                    $domainData = $linkData->ref('domain', 'domainId');
                    //if the domain is searchable and was selected before the search, add the link to the result set; otherwise ignore it
                    if ($domainData->searchable == 1 && in_array($domainData->id, $selectedDomains)) {
                        //create a new link instance
                        $linkClass = new search\linkClass($linkData, $domainData);
                        //insert the matching keyword into the link's keyword set
                        $linkClass->addKeyword($keyword->keyword);
                        //set the link's weight
                        $linkClass->updateWeight($weight);
                        //insert the link into the result set
                        $this->resultData[$linkId] = $linkClass;
                    }
                }
            }
        }
    }
}
Your question is mostly one of opinion, so you may want to include the criteria that allow us to answer "worth it" more objectively.
It appears you've re-invented the concept of database sharding (though without distributing your data across multiple servers).
I assume you are trying to optimize search time; if that's the case, I'd suggest that 2.5 million records on modern hardware is not a particularly big performance challenge, as long as your queries can use an index. If you can't use an index (e.g. because you're doing a regular expression search), sharding will probably not help at all.
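For example, MySQL's EXPLAIN will tell you whether a query can use an index. A sketch against your keyword table (the index name is made up, and a prefix length may be needed depending on the column type):

CREATE INDEX idx_keyword ON keyword (keyword);
EXPLAIN SELECT * FROM keyword WHERE keyword = 'temp';       -- equality/prefix match, can use idx_keyword
EXPLAIN SELECT * FROM keyword WHERE keyword LIKE '%temp%';  -- leading wildcard, forces a full scan

If the second kind of query is what you actually need, splitting the data across 16 tables won't avoid scanning all the rows.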
My general recommendation with database performance tuning is to start with the simplest possible relational solution, keep tuning that until it breaks your performance goals, then add more hardware, and only once you've done that should you go for "exotic" solutions like sharding.
This doesn't mean using prayer as a strategy. For performance-critical applications, I typically build a test database where I can experiment with solutions. In your case, I'd build a database with your schema minus the "sharding" tables, and then populate it with test data (either write your own population routines, or use a tool like DBMonster). Typically, I'd go for at least double the size I expect in production. You can then run and tune queries to prove, one way or another, whether your schema is good enough. It sounds like a lot of work, but it's much less work than your sharding solution is likely to bring along.
There are (as @danFromGermany comments) solutions that are optimized for text search, and you could use MySQL fulltext search features rather than regular expressions.
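A rough sketch of that (assuming the searchable text lives in a content column on your link table; adjust names, and check the fulltext requirements of your MySQL version and storage engine):

ALTER TABLE link ADD FULLTEXT INDEX ft_link_content (content);

SELECT linkId, MATCH (content) AGAINST ('tem*' IN BOOLEAN MODE) AS score
FROM link
WHERE MATCH (content) AGAINST ('tem*' IN BOOLEAN MODE)
ORDER BY score DESC;

Boolean mode's trailing * gives you the prefix matching ("tem*" finding temp, temporary, temple) without fetching every keyword into PHP first.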
I have a strange situation.
Suppose I have a very simple function in PHP (I used Yii, but the problem is general) which is called inside a transaction:
public function checkAndInsert($someKey)
{
    // search for a record in the DB; if it does not exist, insert it
    $data = MyModel::model()->find(array('someKey' => $someKey));
    if ($data == null)
    {
        $data = new MyModel();   // nothing found, create a new record
        $data->someKey = $someKey;
        $data->someCol = 'newOne';
        $data->save();
    }
    else
    {
        $data->someCol = 'test';
        $data->save();
    }
}
...
// $db is the instance variable used for operation on the DB
$db->transaction();
$this->checkAndInsert($someKey);
$db->commit();
That said, if I run the script containing this function by starting many processes, I get duplicate values in the DB. For example, if I have $someKey='pippo' and I run the script with 2 processes, I end up with two (or more) records with column "someCol" = "newOne". This happens randomly, not always.
Is the code wrong? Should I put some constraint in DB in form of KEYs?
I also read this post about adding UNIQUE indexes to TokuDB which says that UNIQUE KEY "kills" write performance...
The approach you have is wrong. It's wrong because you delegate the authority for the integrity/uniqueness check to PHP, but it's the database that's responsible for that.
In other words, you don't have to check whether something exists and then insert. That's bad because there's always some latency between PHP and MySQL, and as you already saw, you can get false results from your checks.
If you need unique values for certain column or combination of columns, you add a UNIQUE constraint. After that you simply insert. If the record exists, insert fails and you can deal with it via Exception. Not only is it faster, it's also easier for you because your code can become a one-liner which is much easier to maintain or understand.
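A sketch, assuming your table is called my_model and someKey is the column that must be unique (adapt the names to your schema):

ALTER TABLE my_model ADD UNIQUE KEY uq_somekey (someKey);

-- MySQL also lets you collapse the insert-or-update into one statement:
INSERT INTO my_model (someKey, someCol)
VALUES ('pippo', 'newOne')
ON DUPLICATE KEY UPDATE someCol = 'test';

With the constraint in place, two processes racing on the same $someKey can no longer both insert; one of them either fails with a duplicate-key error you can catch, or (with ON DUPLICATE KEY UPDATE) simply turns into the update.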
I'm new here (and not a native English speaker, obviously), but I have a problem.
I have an SQL query, an UPDATE like the following:
$rep = $bdd->exec("UPDATE z_agenda SET AGENDA_1='$code' WHERE AGENDA_NOM='$agent' AND AGENDA_TYPE='code'");
BUT, and now the fun begins: I want to replace AGENDA_1 with a variable that can contain AGENDA_1, AGENDA_2, etc. up to AGENDA_31.
But it seems SQL doesn't like it.
So, anybody has an idea?
I'm completely stuck right now.
If you want more explanations, I'm here.
Meanwhile I'll sit, wait, and read some help forums.
I'm adding a bit more code:
"
$mois = $_POST['mois']; (integer)
$debut = $_POST['debut']; (integer : 1-31)
$lettre = $_POST['lettre']; (integer)
$couleur = $_POST['couleur']; (integer)
$agent = $_POST['agent']; (string)
$code = $lettre + $couleur;
$rep = $bdd->exec("UPDATE z_agenda
SET AGENDA_1='$code'
WHERE AGENDA_NOM='$agent'
AND AGENDA_TYPE='code'");
"
My database contains a few information columns, plus 31 columns, one for each day: one row per month per user.
I don't know how else I would manage my database with another structure.
There's actually quite a lot going on here.
you should consider using prepared statements to prevent SQL injection vulnerabilities;
you should read up on database normalization;
you could expand the string and add the columns dynamically using, for example, a for loop, but you don't want to do this!
Having numbered columns is usually a Very Bad Idea. Click the database normalization link for detailed information and thorough guidelines on how to proceed. Your application will become unmaintainable with a database structure like this. You'll be writing 'string building loops' for the rest of your life, whereas problems like the one you're having now have been solved a million times before.
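As a rough sketch of what a normalized layout could look like (hypothetical table and column names), store one row per user per day instead of 31 numbered columns:

CREATE TABLE agenda_entry (
    agenda_nom VARCHAR(50) NOT NULL,   -- the user
    jour       DATE        NOT NULL,   -- the day, replaces AGENDA_1 .. AGENDA_31
    code       INT         NOT NULL,
    PRIMARY KEY (agenda_nom, jour)
);

-- updating one day no longer needs a dynamic column name:
UPDATE agenda_entry SET code = :code WHERE agenda_nom = :agent AND jour = :jour;

The named placeholders (:code, :agent, :jour) are the prepared-statement part mentioned above.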
The loop below is just a static example; adapt the inner code to your needs accordingly.
for ($i = 1; $i <= 31; $i++)
{
    $agenda_column = 'AGENDA_' . $i;
    $rep = $bdd->exec("UPDATE z_agenda SET $agenda_column = '$code' WHERE AGENDA_NOM='$agent' AND AGENDA_TYPE='code'");
}
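If you really must keep the numbered columns for now, a safer variant of the same idea (a sketch, assuming $bdd is a PDO connection, as your exec() call suggests) whitelists the column name and binds the values instead of interpolating them:

$allowed = array();
for ($i = 1; $i <= 31; $i++) {
    $allowed[] = 'AGENDA_' . $i;       // the only column names we will accept
}
$column = 'AGENDA_' . (int) $debut;    // e.g. built from the posted day number
if (in_array($column, $allowed, true)) {
    $stmt = $bdd->prepare(
        "UPDATE z_agenda SET $column = :code WHERE AGENDA_NOM = :agent AND AGENDA_TYPE = 'code'"
    );
    $stmt->execute(array(':code' => $code, ':agent' => $agent));
}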
How can I import UTF-8 data from MovieLens into MySQL?
I got the data from http://grouplens.org/datasets/movielens/ and, for my recommender-system thesis, I only want the 100K and Tag Genome data.
I've been searching on Google and in this forum and I haven't found anything about importing these files into MySQL. I'm currently using phpMyAdmin to manage MySQL, so I'd appreciate it if anybody knows how to easily import those files into MySQL.
I'm fine if you recommend that I iterate over them one by one using PHP, but please explain the code to me.
You'll need to write some custom code to import all of their data into MySQL. Dumbest answer on Stack Overflow ever, right?
So they provide a set of flat files, each described in the README.
README
allbut.pl
mku.sh
u.data
u.genre
u.info
u.item
u.occupation
u.user
u1.base
u1.test
u2.base
u2.test
u3.base
u3.test
u4.base
u4.test
u5.base
u5.test
ua.base
ua.test
ub.base
ub.test
In a nutshell:
Make your own database and tables in MySQL.
Programmatically open a file and parse each line to SQL.
Import the SQL into MySQL.
???
Profit!
Yeah, I know I still haven't really told you anything, let's do one and you can hopefully do the others.
I'll do u.genre, because I'm lazy and it is easy.
Make a new table, I'll assume you know how to make tables and such.
u.genre has two things: a genre and an id.
unknown|0
Action|1
...etc...
So your table should have two fields.
You'll use two data types: https://dev.mysql.com/doc/refman/5.7/en/data-types.html
id - unsigned TINYINT
TINYINT unsigned is 0 to 255
genre - VARCHAR(20)
VARCHAR(20) holds up to 20 characters; their longest is "Documentary", so that'll give you a bit of extra room if they add a new one.
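So the table could be created like this (a sketch; name the table and columns however you like):

CREATE TABLE genre (
    id    TINYINT UNSIGNED NOT NULL PRIMARY KEY,
    genre VARCHAR(20)      NOT NULL
);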
Open the file and get the contents: https://secure.php.net/manual/en/function.file-get-contents.php
$filecontents = file_get_contents("u.genre");
Now let's split up the file by line: https://secure.php.net/manual/en/function.explode.php
$genres = explode("\n", $filecontents);
Now we'll loop through the $genres using foreach and explode again: https://secure.php.net/manual/en/control-structures.foreach.php
foreach ($genres as &$row) {
list($genre,$id) = explode("|",$row);
# more here later
}
Now let's just output SQL, skipping the line if either field is empty. Note the quotes around the string value:
if ($genre != "" && $id !== "") {
    print "INSERT INTO genre (genre,id) VALUES ('$genre',$id);\n";
}
Put it all together...
<?php
$filecontents = file_get_contents("u.genre");
$genres = explode("\n", $filecontents);
foreach ($genres as &$row) {
    list($genre, $id) = explode("|", $row);
    if ($genre != "" && $id !== "") {
        $sql = "INSERT INTO genre (genre,id) VALUES ('$genre',$id);\n";
        print $sql;
        # Insert each into your DB here.
    }
}
?>
Save it and run it from the commandline or put it in a browser for no good reason.
There are too many resources out there showing how to insert data into MySQL, so I'll leave it at this. Everyone's database setup is a bit different, so writing it up for my particular setup won't help you.
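That said, if you want the script to do the inserts itself rather than just printing SQL, a minimal sketch using PDO (database name and credentials are placeholders) would be:

$pdo = new PDO('mysql:host=localhost;dbname=movielens;charset=utf8', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $pdo->prepare("INSERT INTO genre (genre, id) VALUES (?, ?)");

foreach (explode("\n", file_get_contents("u.genre")) as $row) {
    list($genre, $id) = array_pad(explode("|", $row), 2, "");
    if ($genre != "" && $id !== "") {
        $stmt->execute(array($genre, $id));   // the prepared statement handles quoting for you
    }
}

The prepared statement also takes care of values containing quotes or other special characters, which would otherwise break a hand-built INSERT string.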
First of all, I have to tell you that this is my first step with PHP and JSON.
I decided to use JSON to get values from a customer SQL table.
I get my results using this script:
mysql_connect($config['mysql_host'], $config['mysql_user'], $config['mysql_pass']);
//select database
@mysql_select_db($config['db_name']) or die("Unable to select database");
mysql_query('SET CHARACTER SET utf8');
$fet = mysql_query('select * from vehicule');
$json = array();
while ($r = mysql_fetch_array($fet)) {
    $json[] = $r;
}
header('Content-Type: application/json');
echo $json_data = json_encode($json);
Everything is OK, except that my JSON result looks like:
0 = 462;
1 = "Hyundai ix20 crdi 115 panoramic sunsation";
10 = 1346450400;
11 = "462-Hyundai-ix20-crdi-115-panoramic-sunsation";
12 = 462;
...
id = 462;
kilometrage = 14400;
marque = 4;
modele = 137;
motorisation = 2;
ordre = 462;
prix = 17500;
puissance = 6;
titre = "Hyundai ix20 crdi 115 panoramic sunsation";
url = "462-Hyundai-ix20-crdi-115-panoramic-sunsation";
...
I have the results of the table in two versions: one with 0: value, 1: value, 2: ... and the other one using the table keys. How can I print only the second one?
By the way, can someone give me a link so I know what to replace mysql with, since I think it's out of date? (I'm a beginner, only a few hours into PHP.)
Thank you very much!
You have two different issues happening here. One is outright causing the issue you are seeing, and the other is a bad practice mistake that will leave you wide open for trouble in the long run.
The first issue is the one you're asking about. The mysql_fetch_array function (see the Docs here) expects a minimum of one input (the result resource), which you are providing. It also has a second, optional input. That optional input defaults to MYSQL_BOTH, which returns an array with the results available both through keys (column names) and through their numeric indexes. Which is to say that if you select the column 'id', you get its value in both $array[0] and $array['id']. It's duplicated, and thus the JSON encoding carries over the duplication. You need to provide a second value to the function: either MYSQL_ASSOC to get only $array['id'], or MYSQL_NUM to get only $array[0].
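In your loop that means (mysql_fetch_assoc is just a shorthand for the second argument):

while ($r = mysql_fetch_array($fet, MYSQL_ASSOC)) {
    $json[] = $r;
}
// or equivalently
while ($r = mysql_fetch_assoc($fet)) {
    $json[] = $r;
}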
Your second issue is the choice of functions. You're using the 'raw' mysql functions. These have been deprecated, which is a technical term that means 'these functions are no longer supported, but we've left them in to give you time to fix legacy code'. For legacy, read 'old'. Those functions will be going away soon, and you need to upgrade to a better option: either the mysqli functions or the PDO class. I strongly recommend the PDO class, as it's easy to learn once you get started and has the advantage of being more portable. Whichever set you go with, you need to learn to use prepared statements, as both a performance and a security matter. Right now you're working with 'raw' statements, which have a history of being very easy to interfere with via what's called an 'injection attack'. You can see a fictionalized example of such an attack here, and there are plenty of articles online about it. These attacks can be incredibly complex and difficult to fight, so using prepared statements (which handle it for you) is strongly recommended. In your specific example you don't need to worry about it, because you aren't including any user input, but it's an important habit to get into.
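For reference, a PDO version of your whole script might look roughly like this (a sketch; the DSN options and error mode are choices you can adjust, and PDO::FETCH_ASSOC gives you only the column-keyed values):

$pdo = new PDO(
    'mysql:host=' . $config['mysql_host'] . ';dbname=' . $config['db_name'] . ';charset=utf8',
    $config['mysql_user'],
    $config['mysql_pass']
);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->query('SELECT * FROM vehicule');
$json = $stmt->fetchAll(PDO::FETCH_ASSOC);   // associative keys only, no numeric duplicates

header('Content-Type: application/json');
echo json_encode($json);

Once you start feeding user input into queries (a WHERE clause, for example), switch from query() to prepare()/execute() with bound parameters.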