How to differentiate two users with the same IP and user agent? [duplicate] - php

I'm building an analytics tool and I can currently get the user's IP address, browser and operating system from their user agent.
I'm wondering if there is a possibility to detect the same user without using cookies or local storage? I'm not expecting code examples here; just a simple hint of where to look further.
Forgot to mention that it would need to be cross-browser compatible if it's the same computer/device. Basically I'm after device recognition not really the user.

Introduction
If I understand you correctly, you need to identify a user for whom you don't have a Unique Identifier, so you want to figure out who they are by matching Random Data. You can't store the user's identity reliably because:
Cookies can be deleted
The IP address can change
The browser can change
The browser cache may be deleted
A Java Applet or Com Object would have been an easy solution using a hash of hardware information, but these days people are so security-aware that it would be difficult to get people to install these kinds of programs on their system. This leaves you stuck with using Cookies and other, similar tools.
Cookies and other, similar tools
You might consider building a Data Profile, then using Probability tests to identify a Probable User. A profile useful for this can be generated by some combination of the following:
- IP Address
  - Real IP Address
  - Proxy IP Address (users often use the same proxy repeatedly)
- Cookies
  - HTTP Cookies
  - Session Cookies
  - 3rd Party Cookies
  - Flash Cookies (most people don't know how to delete these)
- Web Bugs (less reliable because bugs get fixed, but still useful)
  - PDF Bug
  - Flash Bug
  - Java Bug
- Browsers
  - Click Tracking (many users visit the same series of pages on each visit)
  - Browsers Finger Print
    - Installed Plugins (people often have varied, somewhat unique sets of plugins)
  - Cached Images (people sometimes delete their cookies but leave cached images)
  - Using Blobs
  - URL(s) (browser history or cookies may contain unique user id's in URLs, such as https://stackoverflow.com/users/1226894 or http://www.facebook.com/barackobama?fref=ts)
  - System Fonts Detection (this is a little-known but often unique key signature)
- HTML5 & Javascript
  - HTML5 LocalStorage
  - HTML5 Geolocation API and Reverse Geocoding
  - Architecture, OS Language, System Time, Screen Resolution, etc.
  - Network Information API
  - Battery Status API
The items I listed are, of course, just a few possible ways a user can be identified uniquely. There are many more.
With this set of Random Data elements to build a Data Profile from, what's next?
The next step is to develop some Fuzzy Logic, or, better yet, an Artificial Neural Network (which uses fuzzy logic). In either case, the idea is to train your system, and then combine its training with Bayesian Inference to increase the accuracy of your results.
The NeuralMesh library for PHP allows you to generate Artificial Neural Networks. To implement Bayesian Inference, check out the following links:
Implement Bayesian inference using PHP, Part 1
Implement Bayesian inference using PHP, Part 2
Implement Bayesian inference using PHP, Part 3
At this point, you may be thinking:
Why so much Math and Logic for a seemingly simple task?
Basically, because it is not a simple task. What you are trying to achieve is, in fact, Pure Probability. For example, given the following known users:
User1 = A + B + C + D + G + K
User2 = C + D + I + J + K + F
When you receive the following data:
B + C + E + G + F + K
The question which you are essentially asking is:
What is the probability that the received data (B + C + E + G + F + K) is actually User1 or User2? And which of those two matches is most probable?
In order to effectively answer this question, you need to understand Frequency vs Probability Format and why Joint Probability might be a better approach. The details are too much to get into here (which is why I'm giving you links), but a good example would be a Medical Diagnosis Wizard Application, which uses a combination of symptoms to identify possible diseases.
Think for a moment of the series of data points which comprise your Data Profile (B + C + E + G + F + K in the example above) as Symptoms, and Unknown Users as Diseases. By identifying the disease, you can further identify an appropriate treatment (treat this user as User1).
Obviously, a Disease for which we have identified more than 1 Symptom is easier to identify. In fact, the more Symptoms we can identify, the easier and more accurate our diagnosis is almost certain to be.
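The User1/User2 question above can be sketched as a small Bayesian calculation. This is only an illustration under loud assumptions that are not part of the original answer: the unknown visitor is one of the two known users, both are equally likely beforehand, properties are independent, and a property agrees between a stored profile and a new request with a hypothetical probability `P_AGREE` of 0.9.

```javascript
// Profiles from the example above; each letter is one observed property.
const knownUsers = {
  User1: ["A", "B", "C", "D", "G", "K"],
  User2: ["C", "D", "I", "J", "K", "F"],
};
const received = ["B", "C", "E", "G", "F", "K"];

// Assumed noise model (hypothetical value, tune it against real data).
const P_AGREE = 0.9;

function posteriorByUser(users, observed) {
  const seen = new Set(observed);
  // Universe of properties: everything in any profile or in the request.
  const universe = new Set(observed);
  for (const props of Object.values(users)) {
    props.forEach((p) => universe.add(p));
  }

  // Likelihood of the request under each user, property by property.
  const likelihood = {};
  for (const [name, props] of Object.entries(users)) {
    const has = new Set(props);
    let l = 1;
    for (const p of universe) {
      l *= has.has(p) === seen.has(p) ? P_AGREE : 1 - P_AGREE;
    }
    likelihood[name] = l;
  }

  // Equal priors, so the posterior is just the normalized likelihood.
  const total = Object.values(likelihood).reduce((a, b) => a + b, 0);
  const posterior = {};
  for (const name of Object.keys(likelihood)) {
    posterior[name] = likelihood[name] / total;
  }
  return posterior;
}

const posterior = posteriorByUser(knownUsers, received);
console.log(posterior);
```

Under these assumptions User1 agrees with the request on 6 of the 10 properties and User2 on only 4, so User1 comes out as the far more probable match. With real data the agreement probability would differ per property, which is where the weights discussed below come in.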
Are there any other alternatives?
Of course. As an alternative measure, you might create your own simple scoring algorithm, and base it on exact matches. This is not as efficient as probability, but may be simpler for you to implement.
As an example, consider this simple score chart:
+-------------------------+--------+------------+
| Property                | Weight | Importance |
+-------------------------+--------+------------+
| Real IP address         |     60 |          5 |
| Used proxy IP address   |     40 |          4 |
| HTTP Cookies            |     80 |          8 |
| Session Cookies         |     80 |          6 |
| 3rd Party Cookies       |     60 |          4 |
| Flash Cookies           |     90 |          7 |
| PDF Bug                 |     20 |          1 |
| Flash Bug               |     20 |          1 |
| Java Bug                |     20 |          1 |
| Frequent Pages          |     40 |          1 |
| Browsers Finger Print   |     35 |          2 |
| Installed Plugins       |     25 |          1 |
| Cached Images           |     40 |          3 |
| URL                     |     60 |          4 |
| System Fonts Detection  |     70 |          4 |
| Localstorage            |     90 |          8 |
| Geolocation             |     70 |          6 |
| AOLTR                   |     70 |          4 |
| Network Information API |     40 |          3 |
| Battery Status API      |     20 |          1 |
+-------------------------+--------+------------+
For each piece of information which you can gather on a given request, award the associated score, then use Importance to resolve conflicts when scores are the same.
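A minimal sketch of how such a chart might be applied, using a handful of rows from it. The profile shapes, the sample values and the function names are hypothetical, and summing Importance over the matched properties is just one possible reading of "use Importance to resolve conflicts".

```javascript
// A few rows of the chart above: property -> [weight, importance].
const chart = {
  "Real IP address": [60, 5],
  "HTTP Cookies": [80, 8],
  "Flash Cookies": [90, 7],
  "System Fonts Detection": [70, 4],
  "Localstorage": [90, 8],
};

// Sum weight and importance over the properties whose values match exactly.
function scoreCandidate(request, candidate) {
  let score = 0;
  let importance = 0;
  for (const [property, [weight, imp]] of Object.entries(chart)) {
    const value = request[property];
    if (value !== undefined && value === candidate.data[property]) {
      score += weight;
      importance += imp;
    }
  }
  return { name: candidate.name, score, importance };
}

// Highest score first; equal scores are ordered by importance.
function rankCandidates(request, candidates) {
  return candidates
    .map((c) => scoreCandidate(request, c))
    .sort((a, b) => b.score - a.score || b.importance - a.importance);
}

// Hypothetical stored profiles (same IP, e.g. behind one NAT) and a request.
const candidates = [
  { name: "A", data: { "Real IP address": "203.0.113.7", "Flash Cookies": "f1" } },
  { name: "B", data: { "Real IP address": "203.0.113.7", "Localstorage": "ls-7" } },
];
const request = {
  "Real IP address": "203.0.113.7",
  "Flash Cookies": "f1",
  "Localstorage": "ls-7",
};

const ranked = rankCandidates(request, candidates);
console.log(ranked);
```

Both candidates reach a score of 150 here, so the importance total decides the order and B is listed first.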
Proof of Concept
For a simple proof of concept, please take a look at Perceptron. Perceptron is an RNA model (RNA appears to be the Portuguese abbreviation for artificial neural network, "Rede Neural Artificial") that is generally used in pattern recognition applications. There is even an old PHP class which implements it perfectly, but you would likely need to modify it for your purposes.
Despite being a great tool, Perceptron can still return multiple results (possible matches), so using a Score and Difference comparison is still useful to identify the best of those matches.
Assumptions
Store all possible information about each user (IP, cookies, etc.)
Where result is an exact match, increase score by 1
Where result is not an exact match, decrease score by 1
Expectation
Generate RNA labels
Generate random users emulating a database
Generate a single Unknown user
Generate Unknown user RNA and Values
The system will merge RNA information and teach the Perceptron
After training the Perceptron, the system will have a set of weightings
You can now test the Unknown user's pattern and the Perceptron will produce a result set.
Store all Positive matches
Sort the matches first by Score, then by Difference (as described above)
Output the two closest matches, or, if no matches are found, output empty results
Code for Proof of Concept
$features = array(
    'Real IP address' => .5,
    'Used proxy IP address' => .4,
    'HTTP Cookies' => .9,
    'Session Cookies' => .6,
    '3rd Party Cookies' => .6,
    'Flash Cookies' => .7,
    'PDF Bug' => .2,
    'Flash Bug' => .2,
    'Java Bug' => .2,
    'Frequent Pages' => .3,
    'Browsers Finger Print' => .3,
    'Installed Plugins' => .2,
    'URL' => .5,
    'Cached PNG' => .4,
    'System Fonts Detection' => .6,
    'Localstorage' => .8,
    'Geolocation' => .6,
    'AOLTR' => .4,
    'Network Information API' => .3,
    'Battery Status API' => .2
);
// Get RNA labels
$labels = array();
$n = 1;
foreach ($features as $k => $v) {
    $labels[$k] = "x" . $n;
    $n ++;
}
// Create users
$users = array();
for ($i = 0, $name = "A"; $i < 5; $i ++, $name ++) {
    $users[] = new Profile($name, $features);
}
// Generate unknown user
$unknown = new Profile("Unknown", $features);
// Generate unknown RNA
$unknownRNA = array(
    0 => array("o" => 1),
    1 => array("o" => - 1)
);
// Create RNA values
foreach ($unknown->data as $item => $point) {
    $unknownRNA[0][$labels[$item]] = $point;
    $unknownRNA[1][$labels[$item]] = (- 1 * $point);
}
// Start Perceptron class
$perceptron = new Perceptron();
// Train results
$trainResult = $perceptron->train($unknownRNA, 1, 1);
// Find matches (initialized so count() below is safe when nothing matches)
$matchs = array();
foreach ($users as $name => &$profile) {
    // Use shorter labels
    $data = array_combine($labels, $profile->data);
    if ($perceptron->testCase($data, $trainResult) == true) {
        $score = $diff = 0;
        // Determine the score and difference
        foreach ($unknown->data as $item => $found) {
            if ($unknown->data[$item] === $profile->data[$item]) {
                if ($profile->data[$item] > 0) {
                    $score += $features[$item];
                } else {
                    $diff += $features[$item];
                }
            }
        }
        // Set score and diff
        $profile->setScore($score, $diff);
        $matchs[] = $profile;
    }
}
unset($profile); // Break the reference left over from the loop
// Sort based on score and output (a single match is still a match)
if (count($matchs) > 0) {
    usort($matchs, function ($a, $b) {
        // If the score is the same, use the difference
        if ($a->score == $b->score) {
            // The lower the difference the better
            return $a->diff == $b->diff ? 0 : ($a->diff > $b->diff ? 1 : - 1);
        }
        // The higher the score the better
        return $a->score > $b->score ? - 1 : 1;
    });
    echo "<br />Possible Match ", implode(",", array_slice(array_map(function ($v) {
        return sprintf(" %s (%0.4f|%0.4f) ", $v->name, $v->score, $v->diff);
    }, $matchs), 0, 2));
} else {
    echo "<br />No match Found ";
}
Output:
Possible Match D (0.7416|0.1685),C (0.5393|0.2809)
Print_r of "D":
echo "<pre>";
print_r($matchs[0]);
Profile Object(
[name] => D
[data] => Array (
[Real IP address] => -1
[Used proxy IP address] => -1
[HTTP Cookies] => 1
[Session Cookies] => 1
[3rd Party Cookies] => 1
[Flash Cookies] => 1
[PDF Bug] => 1
[Flash Bug] => 1
[Java Bug] => -1
[Frequent Pages] => 1
[Browsers Finger Print] => -1
[Installed Plugins] => 1
[URL] => -1
[Cached PNG] => 1
[System Fonts Detection] => 1
[Localstorage] => -1
[Geolocation] => -1
[AOLTR] => 1
[Network Information API] => -1
[Battery Status API] => -1
)
[score] => 0.74157303370787
[diff] => 0.1685393258427
[base] => 8.9
)
If Debug = true you would be able to see Input (Sensor & Desired), Initial Weights, Output (Sensor, Sum, Network), Error, Correction and Final Weights.
+----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+
| o | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | Bias | Yin | Y | deltaW1 | deltaW2 | deltaW3 | deltaW4 | deltaW5 | deltaW6 | deltaW7 | deltaW8 | deltaW9 | deltaW10 | deltaW11 | deltaW12 | deltaW13 | deltaW14 | deltaW15 | deltaW16 | deltaW17 | deltaW18 | deltaW19 | deltaW20 | W1 | W2 | W3 | W4 | W5 | W6 | W7 | W8 | W9 | W10 | W11 | W12 | W13 | W14 | W15 | W16 | W17 | W18 | W19 | W20 | deltaBias |
+----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+
| 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 0 | -1 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
| -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | 1 | -19 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
| -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | 1 | -19 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
+----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+
x1 to x20 represent the features converted by the code.
// Get RNA Labels
$labels = array();
$n = 1;
foreach ( $features as $k => $v ) {
$labels[$k] = "x" . $n;
$n ++;
}
Here is an online demo
Class Used:
class Profile {
    public $name, $data = array(), $score, $diff, $base;
    function __construct($name, array $importance) {
        $values = array(-1, 1); // Perceptron values
        $this->name = $name;
        foreach ($importance as $item => $point) {
            // Generate random true/false for real items
            $this->data[$item] = $values[mt_rand(0, 1)];
        }
        $this->base = array_sum($importance);
    }
    public function setScore($score, $diff) {
        $this->score = $score / $this->base;
        $this->diff = $diff / $this->base;
    }
}
Modified Perceptron Class
class Perceptron {
    private $w = array();
    private $dw = array();
    public $debug = false;
    private function initialize($colums) {
        // Initialize perceptron vars
        for ($i = 1; $i <= $colums; $i ++) {
            // Weighting vars
            $this->w[$i] = 0;
            $this->dw[$i] = 0;
        }
    }
    function train($input, $alpha, $teta) {
        $colums = count($input[0]) - 1;
        $weightCache = array_fill(1, $colums, 0);
        $checkpoints = array();
        $keepTrainning = true;
        // Initialize RNA vars
        $this->initialize(count($input[0]) - 1);
        $just_started = true;
        $totalRun = 0;
        $yin = 0;
        // Trains RNA until it gets stable
        while ($keepTrainning == true) {
            // Sweeps each row of the input subject
            foreach ($input as $row_counter => $row_data) {
                // Finds out the number of columns the input has
                $n_columns = count($row_data) - 1;
                // Calculates Yin
                $yin = 0;
                for ($i = 1; $i <= $n_columns; $i ++) {
                    $yin += $row_data["x" . $i] * $weightCache[$i];
                }
                // Calculates real output
                $Y = ($yin <= 1) ? - 1 : 1;
                // Sweeps columns ...
                $checkpoints[$row_counter] = 0;
                for ($i = 1; $i <= $n_columns; $i ++) {
                    /** DELTAS **/
                    // Is it the first row?
                    if ($just_started == true) {
                        $this->dw[$i] = $weightCache[$i];
                        $just_started = false;
                    // Found desired output?
                    } elseif ($Y == $row_data["o"]) {
                        $this->dw[$i] = 0;
                    // Calculates delta Ws
                    } else {
                        $this->dw[$i] = $row_data["x" . $i] * $row_data["o"];
                    }
                    /** WEIGHTS **/
                    // Calculate weights
                    $this->w[$i] = $this->dw[$i] + $weightCache[$i];
                    $weightCache[$i] = $this->w[$i];
                    /** CHECK-POINT **/
                    $checkpoints[$row_counter] += $this->w[$i];
                } // END - for
                foreach ($this->w as $index => $w_item) {
                    $debug_w["W" . $index] = $w_item;
                    $debug_dw["deltaW" . $index] = $this->dw[$index];
                }
                // Special for script debugging
                $debug_vars[] = array_merge($row_data, array(
                    "Bias" => 1,
                    "Yin" => $yin,
                    "Y" => $Y
                ), $debug_dw, $debug_w, array(
                    "deltaBias" => 1
                ));
            } // END - foreach
            // Special for script debugging
            $empty_data_row = array();
            for ($i = 1; $i <= $n_columns; $i ++) {
                $empty_data_row["x" . $i] = "--";
                $empty_data_row["W" . $i] = "--";
                $empty_data_row["deltaW" . $i] = "--";
            }
            $debug_vars[] = array_merge($empty_data_row, array(
                "o" => "--",
                "Bias" => "--",
                "Yin" => "--",
                "Y" => "--",
                "deltaBias" => "--"
            ));
            // Counts training times
            $totalRun ++;
            // Now checks if the RNA is stable already
            $referer_value = end($checkpoints);
            // If all rows match the desired output ...
            $sum = array_sum($checkpoints);
            $n_rows = count($checkpoints);
            if ($totalRun > 1 && ($sum / $n_rows) == $referer_value) {
                $keepTrainning = false;
            }
        } // END - while
        // Prepares the final result
        $result = array();
        for ($i = 1; $i <= $n_columns; $i ++) {
            $result["w" . $i] = $this->w[$i];
        }
        $this->debug($this->print_html_table($debug_vars));
        return $result;
    } // END - train
    function testCase($input, $results) {
        // Sweeps input columns
        $result = 0;
        $i = 1;
        foreach ($input as $column_value) {
            // Calculates test Y
            $result += $results["w" . $i] * $column_value;
            $i ++;
        }
        // Checks whether the test case fits the trained class
        return ($result > 0) ? true : false;
    } // END - testCase
    // Returns the HTML code of a table based on a hash array
    function print_html_table($array) {
        $html = "";
        $inner_html = "";
        $table_header_composed = false;
        $table_header = array();
        // Builds table contents
        foreach ($array as $array_item) {
            $inner_html .= "<tr>\n";
            foreach ($array_item as $array_col_label => $array_col) {
                $inner_html .= "<td>\n";
                $inner_html .= $array_col;
                $inner_html .= "</td>\n";
                if ($table_header_composed == false) {
                    $table_header[] = $array_col_label;
                }
            }
            $table_header_composed = true;
            $inner_html .= "</tr>\n";
        }
        // Builds full table
        $html = "<table border=1>\n";
        $html .= "<tr>\n";
        foreach ($table_header as $table_header_item) {
            $html .= "<td>\n";
            $html .= "<b>" . $table_header_item . "</b>";
            $html .= "</td>\n";
        }
        $html .= "</tr>\n";
        $html .= $inner_html . "</table>";
        return $html;
    } // END - print_html_table
    // Debug function
    function debug($message) {
        if ($this->debug == true) {
            echo "<b>DEBUG:</b> $message";
        }
    } // END - debug
} // END - class
Conclusion
Identifying a user without a Unique Identifier is not a straightforward or simple task. It depends on gathering a sufficient amount of Random Data from the user by a variety of methods.
Even if you choose not to use an Artificial Neural Network, I suggest at least using a Simple Probability Matrix with priorities and likelihoods - and I hope the code and examples provided above give you enough to go on.

This technique (detecting the same users without cookies, or even without an IP address) is called browser fingerprinting. Basically, you collect as much information about the browser as you can; better results can be achieved with JavaScript, Flash or Java (e.g. installed extensions, fonts, etc.). After that, you can store the results hashed, if you want.
It's not infallible, but:
83.6% of the browsers seen had a unique fingerprint; among those with Flash or Java enabled, 94.2%. This does not include cookies!
More info:
https://panopticlick.eff.org/
https://wiki.mozilla.org/Fingerprinting
https://www.browserleaks.com/
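As a rough sketch of the "collect, then hash" idea: the attribute names and values below are hypothetical, and FNV-1a is swapped in only to keep the example dependency-free. It is a small non-cryptographic 32-bit hash, so a real deployment would more likely use something like SHA-256.

```javascript
// FNV-1a over UTF-16 code units (illustrative only, not collision-resistant).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Serialize the attributes in a stable key order before hashing, so the same
// browser yields the same fingerprint no matter how the data was collected.
function fingerprint(attributes) {
  const canonical = Object.keys(attributes)
    .sort()
    .map((key) => `${key}=${JSON.stringify(attributes[key])}`)
    .join("\n");
  return fnv1a(canonical);
}

// Hypothetical attribute sets gathered on three visits.
const visitA = {
  userAgent: "ExampleBrowser/1.0",
  screen: "1920x1080",
  timezoneOffset: -60,
  fonts: ["Arial", "Courier"],
};
const visitB = {
  fonts: ["Arial", "Courier"],
  timezoneOffset: -60,
  screen: "1920x1080",
  userAgent: "ExampleBrowser/1.0",
};
const visitC = { ...visitA, screen: "1366x768" };

console.log(fingerprint(visitA), fingerprint(visitB), fingerprint(visitC));
```

Visits A and B hash to the same value because they differ only in key order, while any changed attribute (visit C) gives a different fingerprint. That brittleness is why an exact hash is best combined with the fuzzier matching discussed earlier in the thread.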

The above-mentioned fingerprinting works, but can still suffer collisions.
One way around this is to add a UID to the URL of each interaction with the user:
http://someplace.com/12899823/user/profile
Every link in the site is adapted with this modifier. It is similar to the way ASP.NET used to work, passing FORM data between pages.

Have you looked into Evercookie?
It may or may not work across browsers. An extract from their site.
"If a user gets cookied on one browser and switches to another browser,
as long as they still have the Local Shared Object cookie, the cookie
will reproduce in both browsers."

You could do this with a cached png, it would be somewhat unreliable (different browsers behave differently, and it'll fail if the user clears their cache), but it's an option.
1: set up a Database that stores a unique user id as a hex string
2: create a genUser.php (or whatever language) file that generates a user id, stores it in the DB and then creates a true color .png out of the values of that hex string (each pixel will be 4 bytes) and return that to the browser. Be sure to set the content-type and cache headers.
3: in the HTML or JS create an image like <img id='user_id' src='genUser.php' />
4: draw that image to a canvas ctx.drawImage(document.getElementById('user_id'), 0, 0);
5: read the bytes of that image out using ctx.getImageData, and convert the integers to a hex string.
6: That is your unique user ID, which is now cached on the user's computer.
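Step 5 could look roughly like the sketch below. The function name is hypothetical, and a stand-in byte array replaces the real canvas read so that the snippet also runs outside a browser. Keeping the alpha channel at 255 is an assumption made here because canvas implementations may alter the colour bytes of semi-transparent pixels.

```javascript
// Convert RGBA pixel data (the layout of ImageData.data) into a hex string.
function pixelsToHex(data) {
  return Array.from(data, (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// In a browser the bytes would come from the canvas, roughly:
//   const { data } = ctx.getImageData(0, 0, width, height);
// Stand-in for two fully opaque pixels:
const sample = new Uint8ClampedArray([0x12, 0x89, 0x98, 0xff, 0x23, 0x00, 0xab, 0xff]);

console.log(pixelsToHex(sample)); // "128998ff2300abff"
```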

You can do it with ETags, although I am not sure if this is legal, as a number of lawsuits were filed.
If you properly warn your users or if you have something like an intranet website it might be ok.
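The mechanism can be sketched as follows. This is not tied to any particular server framework: the function, the lower-cased header object and the `generateId` callback are all hypothetical. The idea is that the first response carries the identifier as an ETag, and the browser echoes it back in `If-None-Match` when it revalidates the cached resource.

```javascript
// requestHeaders: plain object with lower-cased header names (assumption).
// generateId: hypothetical callback that creates a new identifier.
function handleTrackingRequest(requestHeaders, generateId) {
  const presented = requestHeaders["if-none-match"];
  if (presented) {
    // Returning visitor: strip the optional weak prefix and the quotes.
    const id = presented.replace(/^W\//, "").replace(/"/g, "");
    return { status: 304, headers: { ETag: presented }, id };
  }
  // First visit: hand out a new identifier and force revalidation next time.
  const id = generateId();
  return {
    status: 200,
    headers: { ETag: `"${id}"`, "Cache-Control": "private, no-cache" },
    id,
  };
}

const firstVisit = handleTrackingRequest({}, () => "12899823");
const secondVisit = handleTrackingRequest(
  { "if-none-match": firstVisit.headers.ETag },
  () => "should-not-be-used"
);
console.log(firstVisit.status, secondVisit.status, secondVisit.id);
```

Like the cached PNG approach, this stops working as soon as the user clears the browser cache.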

You could potentially create a blob to store a device identifier ...
the downside is that the user needs to download the blob ( you can force the download ),
as the browser can't access the File System to directly save the file.
reference:
https://www.inkling.com/read/javascript-definitive-guide-david-flanagan-6th/chapter-22/blobs

Based on what you have said :
Basically I'm after device recognition not really the user
The best way to do it is to send the MAC address, which is the NIC's ID.
You can take a look at this post :
How can I get the MAC and the IP address of a connected client in PHP?

Inefficient, but may give you the desired results, would be to poll an API on your side. Have a background process on the client side which sends user data at an interval. You will need a user identifier to send to your API. Once you have that you can send along any information associated to that unique identifier.
This removes the need for cookies and localstorage.

I can't believe http://browserspy.dk still has not been mentioned here!
The site describes many features (in terms of pattern recognition) which could be used to build a classifier.
And of course, for evaluating the features I'd suggest Support Vector Machines, and libsvm in particular.

Track them during a session or across sessions?
If your site is HTTPS Everywhere you could use the TLS Session ID to track the user's session

Create a cross-platform dummy (NPAPI) plugin and generate a unique name for the plugin name or version when the user downloads it (e.g. after login).
Provide an installer for the plugin, or install it per policy.
This will require the user to willingly install the identifier.
Once the plugin is installed, the fingerprint of any (plugin-enabled) browser will contain this specific plugin. To return the info to a server, an algorithm to effectively detect the plugin on the client side is needed; otherwise IE and Firefox >= 28 users will need a table of possible valid identifiers.
This requires a relatively high investment in a technology that will likely be shut down by the browser vendors. If you are able to convince your users to install a plugin, there may also be options like installing a local proxy, using a VPN or patching the network drivers.
Users who do not want to be identified (or to have their machines identified) will always find a way to prevent it.

Related

Reorder mysql table ROWS on front end, and update backend [duplicate]

I have a table of food items. They have a "Position" field that represents the order they should appear in on a list (listID is the list they are on, we don't want to re-order items on another list).
+--id--+--listID--+---name---+--position--+
| 1 | 1 | cheese | 0 |
| 2 | 1 | chips | 1 |
| 3 | 1 | bacon | 2 |
| 4 | 1 | apples | 3 |
| 5 | 1 | pears | 4 |
| 6 | 1 | pie | 5 |
| 7 | 2 | carrots | 0 |
| 8,9+ | 3,4+ | ... | ... |
+------+----------+----------+------------+
I want to be able to say "Move Pears to before Chips", which involves setting the position of Pears to 1 and then incrementing all the positions in between by 1, so that my resulting table looks like this:
+--id--+--listID--+---name---+--position--+
| 1 | 1 | cheese | 0 |
| 2 | 1 | chips | 2 |
| 3 | 1 | bacon | 3 |
| 4 | 1 | apples | 4 |
| 5 | 1 | pears | 1 |
| 6 | 1 | pie | 5 |
| 7 | 2 | carrots | 0 |
| 8,9+ | 3,4+ | ... | ... |
+------+----------+----------+------------+
So that all I need to do is SELECT name FROM mytable WHERE listID = 1 ORDER BY position and I'll get all my food in the right order.
Is it possible to do this with a single query? Keep in mind that a record might be moving up or down in the list, and that the table contains records for multiple lists, so we need to isolate the listID.
My knowledge of SQL is pretty limited so right now the only way I know of to do this is to SELECT id, position FROM mytable WHERE listID = 1 AND position BETWEEN 1 AND 5 then I can use Javascript (node.js) to change position 5 to 1, and increment all others +1. Then UPDATE all the records I just changed.
It's just that any time I try to read up on SQL, everyone keeps saying to avoid multiple queries, avoid synchronous coding, and so on.
Thanks
This calls for a complex query that updates many records. But a small change to your data can change things so that it can be achieved with a simple query that modifies just one record.
UPDATE mytable SET position = position * 10;
In the old days, the BASIC programming language on many systems had line numbers, which encouraged spaghetti code: instead of functions, many people wrote GOTO line_number. Real trouble arose if you numbered the lines sequentially and had to add or delete a few lines. How did people get around it? By incrementing line numbers by 10! That's what we are doing here.
So you want pears to be the second item? After the multiplication, cheese is at 0 and chips at 10, so any value in between works:
UPDATE mytable SET position = 5 WHERE listID = 1 AND name = 'pears';
Worried that the gaps between the items will eventually disappear after multiple reorderings? No fear, just run
UPDATE mytable SET position = position * 10;
from time to time.
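With gapped positions, the application only has to pick a free value between the two neighbours of the target slot. A minimal sketch (the helper name is hypothetical):

```javascript
// Returns an integer position strictly between two neighbours, or null when
// the gap is used up and the list needs re-spacing (position * 10) first.
function positionBetween(before, after) {
  const middle = Math.floor((before + after) / 2);
  return middle > before && middle < after ? middle : null;
}

console.log(positionBetween(0, 10)); // 5
console.log(positionBetween(5, 6)); // null
```

A `null` result is the signal to run the re-spacing UPDATE before trying again.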
I do not think this can be conveniently done in fewer than two queries, which is OK: there should be as few queries as possible, but not at any cost. The two queries would look like this (based on what you wrote yourself). Note the order: the rows in between are shifted first, then pears is moved into the freed slot.
UPDATE mytable SET position = position + 1 WHERE listID = 1 AND position BETWEEN 1 AND 3;
UPDATE mytable SET position = 1 WHERE listID = 1 AND name = 'pears';
I've mostly figured out my problem, so I've decided to put an answer here in case anyone finds it helpful.
I can make use of a CASE statement in SQL. Also, by using Javascript beforehand to build my SQL query, I can change multiple records.
This builds my SQL query:
var sql;
// Rows between the two positions shift by +1 when moving up the list, -1 when moving down
var incrementDirection = (startPos > endPos) ? 1 : -1;
sql = "UPDATE mytable SET position = CASE WHEN position = " + startPos + " THEN " + endPos;
for (var i = endPos; i != startPos; i += incrementDirection) {
    sql += " WHEN position = " + i + " THEN " + (i + incrementDirection);
}
sql += " ELSE position END WHERE listID = " + listID;
If I want to move Pears to before Chips, I can set:
startPos = 4;
endPos = 1;
listID = 1;
My code will produce an SQL statement that looks like:
UPDATE mytable
SET position = CASE
    WHEN position = 4 THEN 1
    WHEN position = 1 THEN 2
    WHEN position = 2 THEN 3
    WHEN position = 3 THEN 4
    ELSE position
END
WHERE listID = 1
I run that query and my final table will look like:
+--id--+--listID--+---name---+--position--+
| 1 | 1 | cheese | 0 |
| 2 | 1 | chips | 2 |
| 3 | 1 | bacon | 3 |
| 4 | 1 | apples | 4 |
| 5 | 1 | pears | 1 |
| 6 | 1 | pie | 5 |
| 7 | 2 | carrots | 0 |
| 8,9+ | 3,4+ | ... | ... |
+------+----------+----------+------------+
After that, all I have to do is run SELECT name FROM mytable WHERE listID = 1 ORDER BY position and the output will be as follows:
cheese
pears
chips
bacon
apples
pie
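As a possible variation on the CASE approach above, the per-row branches can be replaced by a single range condition, with the values passed as parameters instead of being concatenated into the string. This is a sketch, not the method used in the answer: the function name is hypothetical, and the `?` placeholders assume a driver that supports positional parameters.

```javascript
// Builds one UPDATE that moves the row at startPos to endPos and shifts the
// rows in between by one, limited to a single list.
function buildMoveQuery(startPos, endPos, listID) {
  for (const n of [startPos, endPos, listID]) {
    if (!Number.isInteger(n)) {
      throw new TypeError("positions and listID must be integers");
    }
  }
  // Moving up the list pushes the rows in between down (+1), and vice versa.
  const shift = startPos > endPos ? 1 : -1;
  const low = Math.min(startPos, endPos);
  const high = Math.max(startPos, endPos);
  const sql =
    "UPDATE mytable " +
    "SET position = CASE WHEN position = ? THEN ? ELSE position + ? END " +
    "WHERE listID = ? AND position BETWEEN ? AND ?";
  return { sql, params: [startPos, endPos, shift, listID, low, high] };
}

// Move Pears (position 4) to before Chips (position 1) on list 1.
const move = buildMoveQuery(4, 1, 1);
console.log(move.sql, move.params);
```

The statement text stays the same however far the item moves, and only rows between the two positions on the given list are touched.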

Select highest percent first php sql

I have the following sql which selects the most recurring row first based on the column "reported"
$datan = mysql_query("
SELECT *, COUNT(reported) AS ct
FROM profile_reports
WHERE open = '1'
GROUP BY reported
ORDER BY ct DESC
LIMIT 1
") or die(mysql_error());
I want my sql to also check which 'reporter' (each is a number associated with a user) has the best percentage of useful reports, which is determined this way:
((raction > 0 AND raction < 99 AND open = '0' AND reporter = 'reporter') / (reporter = 'reporter' AND open = '0')) * 100
...and show the rows with highest percentage first. It's a little tricky because no initial reporter is set.
Here's a sample table:
+----+----------+----------+-------+----------+
| id | reporter | reported | open | raction |
+----+----------+----------+-------+----------+
| 1 | 24 | 26 | 0 | 3 |
| 2 | 24 | 23 | 0 | 0 |
| 3 | 24 | 29 | 1 | |
| 4 | 12 | 29 | 0 | 4 |
| 5 | 12 | 29 | 1 | |
| 6 | 24 | 21 | 1 | 0 |
+----+----------+----------+-------+----------+
I want it to see that there are more reports about user 29(column: reported), then check which reporting user(column: reporter) has the best percentage (based on the line of code above), in this case user 12, and display their report
It's actually pretty easy: just take the sums of your conditions and divide. In order to get the "reported" user correctly, you'll need to use an inline view to find the user with the most open reports.
SELECT pr.*,
       ( SUM(pr.raction > 0
             AND pr.raction < 99
             AND pr.open = '0'
             AND pr.reported = t.reported)
         / SUM(pr.reported = t.reported
               AND pr.open = '0') ) * 100 AS usefull
FROM profile_reports pr,
     (SELECT reported
      FROM profile_reports
      WHERE open = '1'
      GROUP BY reported
      ORDER BY COUNT(reported) DESC
      LIMIT 1) t
GROUP BY reporter
ORDER BY usefull DESC
LIMIT 1
demo
Output
| ID | REPORTER | REPORTED | OPEN | RACTION | USEFULL |
-------------------------------------------------------
| 4 | 12 | 29 | 0 | 4 | 100 |
I haven't done everything for you. You will have to decide what to do if the divisor is zero.
Note that in just about every database other than MySQL, you would need to use CASE:
SUM ( CASE WHEN raction > 0 AND .... THEN 1 ELSE 0 END) / ....
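To make the arithmetic concrete, here is the same calculation applied to the sample table in plain Javascript. The function name is hypothetical, blank raction cells are represented as null, and returning null for a zero divisor is just one possible answer to the open question above.

```javascript
// Sample rows from the question.
const reports = [
  { id: 1, reporter: 24, reported: 26, open: 0, raction: 3 },
  { id: 2, reporter: 24, reported: 23, open: 0, raction: 0 },
  { id: 3, reporter: 24, reported: 29, open: 1, raction: null },
  { id: 4, reporter: 12, reported: 29, open: 0, raction: 4 },
  { id: 5, reporter: 12, reported: 29, open: 1, raction: null },
  { id: 6, reporter: 24, reported: 21, open: 1, raction: 0 },
];

// For each reporter: useful closed reports about `reported`, divided by all
// closed reports about `reported`, times 100. null when the divisor is zero.
function usefulPercentByReporter(rows, reported) {
  const counts = new Map();
  for (const row of rows) {
    const c = counts.get(row.reporter) || { useful: 0, closed: 0 };
    if (row.reported === reported && row.open === 0) {
      c.closed += 1;
      if (row.raction > 0 && row.raction < 99) c.useful += 1;
    }
    counts.set(row.reporter, c);
  }
  const result = {};
  for (const [reporter, c] of counts) {
    result[reporter] = c.closed === 0 ? null : (c.useful / c.closed) * 100;
  }
  return result;
}

// User 29 has the most open reports in the sample data.
const usefulness = usefulPercentByReporter(reports, 29);
console.log(usefulness);
```

Reporter 12 gets 100 (one useful report out of one closed report about user 29), while reporter 24 has no closed report about user 29 and therefore no percentage.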

How to tabulate this javascript array in a mysql database? What index should I use?

My question is about tabulating data in MySql. I was wondering, how to best represent this javascript array in MySql? What index should I use? I'm going to use the data to populate a javascript array via PHP.
A[i] represents a card. B[i] represents a matching card.
A = new Array();
A[0] = new Array();
A[0][0]='eat';
A[0][1] = 1;
A[0][2] = 0;
A[1] = new Array();
A[1][0]='drink';
A[1][1] = 2;
A[1][2] = 0;
B = new Array();
B[0] = new Array();
B[0][0]='tacos';
B[0][1] = 1;
B[0][2] = 0;
B[1] = new Array();
B[1][0]='tequila';
B[1][1] = 2;
B[1][2] = 0;
I need to be able to uniquely identify components within the array later, so that I can use parts of the data to populate new arrays (So I can use and combine different cards into a new array). For example, I might want to populate a new array in javascript using A[0][0], A[0][1], A[0][2],B[0][0], B[0][1] and info from another array stored in the MySql (Lets say Y[2][0], Y[2][1],Y[2][2],Z[2][0], Z[2][1]).
This is what I've come up with so far.
-----------------------------------------
| card pair | card |card info|Tag|Tag2|
-----------------------------------------
| 1 | A | eat | 1 | 0 |
| 1 | B | tacos | 1 | - |
| 2 | A | drink | 2 | 0 |
| 2 | B | tequila | 2 | - |
-----------------------------------------
Maybe I need to add a primary index to the above one?
-------------------------------
|card pair |card info|Tag|Tag2|
-------------------------------
| 1A | eat | 1 | 0 |
| 1B | tacos | 1 | - |
| 2A | drink | 2 | 0 |
| 2B | tequila | 2 | - |
-------------------------------
I thought the card pair could be the index. Not sure if this is possible or a good idea. Also not sure what type of index I would use if I did.
If you have a better way to tabulate the data or can recommend what type of index to use I'd much appreciate it.
EDIT: I think I can do away with the last 2 columns (Tag and Tag2), so I think I might just use the table as below.
-------------------------
| card pair | card info |
-------------------------
| 1A        | eat       |
| 1B        | tacos     |
| 2A        | drink     |
| 2B        | tequila   |
-------------------------
Should I add an incrementing index to the table? Is the card pair sufficient as the index? If yes, what is the best index type to use?
Thanks!
Well, from a database perspective, you will want to 'normalize' this information.
I think it would be more like this:
card
------------
card_id
info
card_pair
------------
card_1_id
card_2_id
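Since the goal in the question is to populate JavaScript arrays from the stored data, here is a minimal sketch of how rows from the two tables above might be turned back into the A and B arrays. The row objects and the buildPairArrays helper are hypothetical (they stand in for whatever PHP would send to the page), and storing the pair number as the second element is an assumption based on the Tag values in the question.

```javascript
// Hypothetical rows, shaped like the result of SELECTs on the two tables.
const cards = [
  { card_id: 1, info: 'eat' },
  { card_id: 2, info: 'tacos' },
  { card_id: 3, info: 'drink' },
  { card_id: 4, info: 'tequila' },
];
const cardPairs = [
  { card_1_id: 1, card_2_id: 2 },
  { card_1_id: 3, card_2_id: 4 },
];

// Rebuild the A and B arrays from the question: A[i] is a card and
// B[i] is its match, each stored as [info, pair number].
function buildPairArrays(cards, cardPairs) {
  // Index the cards by id so each pair lookup is a single Map access.
  const infoById = new Map(cards.map((c) => [c.card_id, c.info]));
  const A = [];
  const B = [];
  cardPairs.forEach((pair, i) => {
    A.push([infoById.get(pair.card_1_id), i + 1]);
    B.push([infoById.get(pair.card_2_id), i + 1]);
  });
  return { A, B };
}

const { A, B } = buildPairArrays(cards, cardPairs);
console.log(A[0][0], B[0][0]); // prints "eat tacos"
```

Because every card has its own card_id, any single card can be picked out later and combined with cards from other pairs.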

MySQL select data based on months and sum them together (preferably in ZEND)

I have a fairly complicated MySQL query that I need help implementing in Zend Framework. I have a database named 'power' that is structured as follows:
id | addr | timestamp | power1 | power2 | serial
21 | FAS235DQ92F6C110 | 2011-11-08 22:51:55 | 4.25698 | 2.0189 | DEADBEEF
22 | FAS235DQ92F6C110 | 2011-11-09 22:53:05 | 0 | 1.0568 | DEADBEEF
23 | FAS235DQ92F6C110 | 2011-11-10 22:51:55 | 4.25698 | 2.0189 | DEADBEEF
24 | FAS235DQ92F6C110 | 2011-11-11 22:53:05 | 0 | 1.0568 | DEADBEEF
33 | A1B2C3D4E5F67890 | 2011-11-20 14:51:25 | 19.123 | 2.9765 | DEADBEEF
34 | A1B2C3D4E5F67890 | 2011-11-21 14:51:54 | 1.90876 | 12.123 | DEADBEEF
35 | A1B2C3D4E5F67890 | 2011-11-22 14:51:25 | 19.123 | 2.9765 | DEADBEEF
36 | A1B2C3D4E5F67890 | 2011-11-23 14:51:54 | 1.90876 | 12.123 | DEADBEEF
I would like to do the following in a SQL statement, preferably using the Zend DB functions, but that is not required:
Based on 'serial' and 'addr', I want to add all of the power1's and power2's up for a month. So, in this table, I would want a query that returns a row of size 2 (one for each of 'power1' and 'power2'). If I look at the row for "November and 'serial'=DEADBEEF and 'addr'= FAS235DQ92F6C110", I want power1Sum to be ~8.5 and power2Sum to be ~6.2.
Does anyone know how to make this query in pure MySQL code or in the Zend framework?
Thanks so much!
I think something like this may help you for Zend Framework. This returns one row with the totals for the selected period.
Assume $addr = 'FAS235DQ92F6C110' and $serial = 'DEADBEEF', this returns the totals for November 1 - 30, 2011.
$select = $table->select()
    ->from('power', array(
        'power1Sum' => 'SUM(power1)',
        'power2Sum' => 'SUM(power2)'))
    ->where('serial = ?', $serial)
    ->where('addr = ?', $addr)
    // Half-open range: "<= '2011-11-30'" would stop at midnight and
    // drop every reading taken later on November 30.
    ->where('timestamp >= ?', '2011-11-01')
    ->where('timestamp < ?', '2011-12-01');
$row = $select->query()->fetch();
// SUM() yields NULL when no rows match, so test the value itself.
if ($row && $row['power1Sum'] !== null) {
    echo 'Power 1 Sum = ' . $row['power1Sum'] . '<br />';
    echo 'Power 2 Sum = ' . $row['power2Sum'] . '<br />';
} else {
    echo "No results found.";
}
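To run the same query for any month, the two date bounds have to be computed. MySQL reads a date-only string compared against a DATETIME column as midnight at the start of that day, so a half-open range that ends on the first day of the following month is a safe way to cover the whole month. Below is a minimal sketch of that arithmetic, written in JavaScript to match the array code earlier on this page; monthRange is a hypothetical helper, and the same logic ports directly to PHP.

```javascript
// Return the bounds of a calendar month as date strings (month is 1-12).
// The range is half-open: start is included, end is excluded.
function monthRange(year, month) {
  const pad = (n) => String(n).padStart(2, '0');
  // December rolls over into January of the following year.
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  return {
    start: `${year}-${pad(month)}-01`,
    end: `${nextYear}-${pad(nextMonth)}-01`,
  };
}

const november = monthRange(2011, 11);
console.log(november.start, november.end); // prints "2011-11-01 2011-12-01"
```

The two values would be used as `timestamp >= start` and `timestamp < end`.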

Update with an incremented value based on corresponding row

Here's a problem for the PHP-juggler in you.
I want to use plain-ol' mysql_* functions of PHP.
I have the following MySQL table:
+-----+-----------------+
| id | thread |
+-----+-----------------+
| 14 | 01/ |
| 14 | 02/ |
| 14 | 03/ |
| 15 | 01/ |
| 22 | 01/ |
| 24 | XXX |
| 24 | XXX |
| 24 | XXX |
| 24 | XXX |
| 32 | XXX |
| 32 | XXX |
+-----+-----------------+
The "XXX" values are my making. I want to change (UPDATE) that table to this one:
+-----+-----------------+
| id | thread |
+-----+-----------------+
| 14 | 01/ <- |
| 14 | 02/ |
| 14 | 03/ |
| 15 | 01/ <- |
| 22 | 01/ <- |
| 24 | 01/ <- |
| 24 | 02/ |
| 24 | 03/ |
| 24 | 04/ |
| 32 | 01/ <- |
| 32 | 02/ |
+-----+-----------------+
On every new value of the "id" field (where the "<-" is; my making, also), the "thread" field value has to reset itself to "01/" and continue incrementing until a new value of "id" is found.
I've tried querying with COUNT(id) to increment somehow. I tried storing in arrays. I thought of mysql_data_seek() also. Alas, I don't seem to pull it off.
I got the "thread" format right, though:
$thread = $i < 10 ? "0$i" : $i;
So, if it's 10 or bigger, it doesn't get a leading zero. But this is just the fun part.
Any help would be appreciated.
Thanks
SET @oldid = 0;
SET @counter = 0;
UPDATE tablename
SET thread = CONCAT(
        LPAD(
            CAST(IF(id = @oldid,
                    @counter := @counter + 1,       -- same id, increment
                    @counter := (@oldid := id)/id)  -- other id, set to 1
                 AS UNSIGNED),
            2,'0'),        -- pad with '0'
        '/')               -- append '/'
WHERE thread = 'XXX'       -- or enumerate the whole thing if need be
ORDER BY id, thread;
This can just be fed to "plain ol' mysql_query" as three calls in a row: feed the two SET statements and the UPDATE separately. Alternatively, forget about SETting anything; I just hate uninitialized variables ;)
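The counter logic packed into that UPDATE can be hard to follow on a first read, so here is a minimal sketch of the same idea outside SQL: walk the rows in id order, restart the counter at 1 whenever the id changes, and otherwise increment it. It is written in JavaScript for illustration only; numberThreads and the row objects are hypothetical, and the rows are assumed to be sorted by id already.

```javascript
// Number rows per id: the counter restarts at 1 whenever the id changes.
// `rows` is assumed to be sorted by id, like ORDER BY id in the query.
function numberThreads(rows) {
  let oldId = null;
  let counter = 0;
  return rows.map((row) => {
    counter = row.id === oldId ? counter + 1 : 1;
    oldId = row.id;
    // Zero-pad to two digits and append '/', e.g. 1 becomes "01/".
    return { id: row.id, thread: String(counter).padStart(2, '0') + '/' };
  });
}

const rows = [24, 24, 24, 24, 32, 32].map((id) => ({ id, thread: 'XXX' }));
console.log(numberThreads(rows).map((r) => r.thread).join(' '));
// prints "01/ 02/ 03/ 04/ 01/ 02/"
```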
Make the tuple (id, thread) the PRIMARY KEY and set thread (but not the id!) as AUTO_INCREMENT. Note that MySQL only allows AUTO_INCREMENT on the second column of a multi-column key for MyISAM tables. Then run the query
INSERT INTO mytable (id) VALUES (24),(24),(32),(24),(32),(24)
and the thread attribute should be numbered automatically within each id. If you insist on the "0n/" form, I suggest creating a thread_string attribute and a BEFORE UPDATE trigger that fills it from the NEW.thread attribute.
Does it work?
