My input values are 1, 2, 3, 4, ... and my output values are 1*1, 2*2, 3*3, 4*4, ...
My code looks like this:
$reg = new LeastSquares();
$samples = array();
$targets = array();
for ($i = 1; $i < 100; $i++)
{
    $samples[] = [$i];
    $targets[] = $i*$i;
}
$reg->train($samples, $targets);
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";
I expect it to output roughly 25 and 100. But I get:
-1183.3333333333
-683.33333333333
I also tried to use SVR instead of LeastSquares but the values are strange too:
2498.23
2498.23
I am new to ML. What am I doing wrong?
As others have pointed out in the comments, LeastSquares fits a linear model to your data (the training examples).
Your data set (target = sample^2) is inherently non-linear. If you picture the best possible line (in the least-squares-of-residuals sense) drawn through a quadratic curve, you get a line with a negative y-intercept.
You've trained your linear model on data up to x=99, y=9801, so the fitted line is steep and its y-intercept is a large negative number. Down at x=5 or x=10 you therefore end up with a large negative prediction, as you've found.
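To make the numbers concrete, here is a minimal sketch that fits the same straight line by hand, using the textbook closed-form formulas for one-feature least squares instead of php-ml. The function name fitLine is only an illustration, not part of any library.

```php
<?php
// Ordinary least squares for a single feature, computed by hand
// (no php-ml needed) to show where the negative predictions come from.
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $covXY = 0.0;
    $varX = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $covXY += ($xs[$i] - $meanX) * ($ys[$i] - $meanY);
        $varX  += ($xs[$i] - $meanX) ** 2;
    }

    $slope = $covXY / $varX;
    $intercept = $meanY - $slope * $meanX;
    return [$slope, $intercept];
}

// Same training data as in the question: x = 1..99, y = x^2.
$xs = range(1, 99);
$ys = array_map(fn($x) => $x * $x, $xs);

[$slope, $intercept] = fitLine($xs, $ys);

printf("slope = %.2f, intercept = %.2f\n", $slope, $intercept);
printf("prediction at x=5:  %.2f\n", $slope * 5 + $intercept);
printf("prediction at x=10: %.2f\n", $slope * 10 + $intercept);
```

The slope comes out at 100 and the intercept at about -1683.33, which gives about -1183.33 at x=5 and -683.33 at x=10, the same values the question reports.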
If you use support vector regression with a degree-2 polynomial kernel, it will do a good job of capturing the pattern of your data:
<?php
require_once __DIR__ . '/vendor/autoload.php';
use Phpml\Regression\SVR;
use Phpml\SupportVectorMachine\Kernel;
$samples = array();
$targets = array();
for ($i = 1; $i <= 100; $i++)
{
    $samples[] = [$i];
    $targets[] = $i*$i;
}
// Second constructor argument is the polynomial degree.
$reg = new SVR(Kernel::POLYNOMIAL, 2);
$reg->train($samples, $targets);
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";
?>
Returns:
25.0995
100.098
From your response in the comments it's clear that you're looking to apply a neural network so that you don't have to worry about what degree of model to fit to your data. A neural network with a single hidden layer can approximate any continuous function on a bounded interval arbitrarily well, given enough hidden nodes and enough training data.
Unfortunately php-ml doesn't seem to have an MLP (multilayer perceptron, another term for a neural network) for regression available out of the box. I'm sure you could build one from appropriate layers, but if your goal is to get up and running with training regression models quickly, it might not be the best approach.
I have a little project going and since I am not the best at OOP I have come here for some help.
I have a class called "truckinfo":
class truckinfo
{
    public $vendor;
    public $truckcount;
    public $plate;
}
Now I get the information, like plate or vendor, from somewhere else and then display the trucks in a list.
I plan on creating the list somewhat like this:
for($i = 0; $i < 3; $i++){
    $truck = new truckinfo();
    $truck->plate = "abc" . $i;
    $truck->truckcount = $i + 5;
    echo $truck->truckcount;
}
My question is, is it possible after displaying all the trucks to get the information from the first truck (in this case truck with plate "abc0") or can I only get the info for the latest one that got into the loop?
Thank you in advance.
With this small piece of logic, no variable holds a reference to the first truck object once the loop has moved on, so (barring some trickery I'm unaware of) there is no way to get that object back. Basic courses often introduce so-called 'management' classes that construct objects of one kind and keep track of them in an array (or another data structure), allowing you to search for them and so on. Especially if you are working towards a small, basic CRUD application, it is nice to have that responsibility grouped in a separate class.
This way you can manage how objects are identified and cleanly look up and return the object that matches a given identifier.
For your example, I assume storing them in an associative array would suffice: $trucks[$truck->plate] = $truck gives you a small mapping of license plates to truckinfo objects. You can then retrieve a truck directly by its license plate.
$trucks = [];
for($i = 0; $i < 3; $i++){
    $truck = new truckinfo();
    $truck->plate = "abc" . $i;
    $truck->truckcount = $i + 5;
    echo $truck->truckcount;
    // Keep a reference to every truck, keyed by its plate
    $trucks[$truck->plate] = $truck;
}
echo $trucks['abc0']->truckcount;
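The 'management' class idea mentioned in this answer could look roughly like the sketch below. The class name TruckRegistry and its methods add and findByPlate are made up for illustration; only truckinfo and its properties come from the question.

```php
<?php
class truckinfo
{
    public $vendor;
    public $truckcount;
    public $plate;
}

// Hypothetical manager class: creates trucks and remembers them by plate.
class TruckRegistry
{
    private $trucks = [];

    // Create a truck, store it under its plate and return it.
    public function add(string $plate, int $truckcount): truckinfo
    {
        $truck = new truckinfo();
        $truck->plate = $plate;
        $truck->truckcount = $truckcount;
        $this->trucks[$plate] = $truck;
        return $truck;
    }

    // Return the truck with this plate, or null if none was registered.
    public function findByPlate(string $plate): ?truckinfo
    {
        return $this->trucks[$plate] ?? null;
    }
}

$registry = new TruckRegistry();
for ($i = 0; $i < 3; $i++) {
    $registry->add("abc" . $i, $i + 5);
}

// The first truck is still reachable after the loop has finished.
echo $registry->findByPlate("abc0")->truckcount . "\n"; // prints 5
```

Keeping the array private means the rest of the code can only reach trucks through the lookup method, which is the main point of grouping that responsibility in one class.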
You can simply create as many instances of your object as you like. E.g. you could save them in an array for later direct access.
$trucks = array();
for($i = 0; $i < 3; $i++){
    $trucks[$i] = new truckinfo();
    $trucks[$i]->plate = "abc" . $i;
    $trucks[$i]->truckcount = $i + 5;
    echo $trucks[$i]->truckcount;
}
echo $trucks[0]->truckcount; //example
I'm working on a 'thought' function for a game I'm building. It pulls random strings from an XML file, combines them and makes them 'funny'. However, I'm running into a small issue: the same couple of items keep getting selected each time.
The two functions I am using are:
function randRoller($number)
{
    mt_srand((microtime()*time())/3.145);
    $x = [];
    for($i = 0; $i < 100; $i++)
    {
        #$x = mt_rand(0,$number);
    }
    return mt_rand(0,$number);
}
/* RETRIEVE ALL RELEVANT DATA FROM THE XML FILE */
function retrieveFromXML($node)
{
    $node = strtolower($node);
    $output = [];
    $n = substr($node,0,4);
    #echo $node;
    foreach($this->xml->$node->$n as $data)
    {
        $output[] = $data->attributes();
    }
    $count = count($output)-1;
    $number = $this->randRoller($count);
    return $output[$number];
}
Granted, the "randRoller" function is sort of defunct now, because the original version I had (which 'rolled' ten numbers from the count and then selected the one that came up most often) didn't work as planned.
I've tried everything I can think of to get better results and have googled my brains out to fix it, but I'm still getting the same repetitive results.
Don't use mt_srand() unless you know what you are doing, since it is called automatically. See the note on http://php.net/manual/en/function.mt-srand.php:
Note: There is no need to seed the random number generator with srand() or mt_srand() as this is done automatically.
Remove (all) the mt_srand() call(s).
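As a rough sketch of what the selection could look like once the seeding is gone: the helper name pickRandom and the sample strings are placeholders, not taken from the question's XML.

```php
<?php
// Pick one element at random without seeding; mt_rand() seeds itself.
function pickRandom(array $items)
{
    // Valid indexes of a list run from 0 to count - 1.
    return $items[mt_rand(0, count($items) - 1)];
}

$thoughts = ['alpha', 'beta', 'gamma', 'delta']; // placeholder data
for ($i = 0; $i < 5; $i++) {
    echo pickRandom($thoughts) . "\n";
}
```

For a list like this, array_rand($items) would also work; it returns a random key of the array.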
So crazy! I have a bug that's 100% reproducible, it happens in only a few lines of code, yet I cannot for the life of me determine what the problem is.
My project is a workout maker, and the mystery involves two functions:
get_pairings: It makes a set of $together_pairs (easy) and $mixed_pairs (annoying), and combines them into $all_pairs, used to make the workout.
make_mixed_pairs: this has different logic depending on whether it's a partner vs solo workout. Both cases return a set of $mixed_pairs (in the same exact format), called by the function above.
The symptoms/clues:
The case of the solo workout is fine, $all_pairs will only contain $mixed_pairs (because as it's defined, $together_pairs are only for partner workouts)
In the case of a partner workout, when I combine the two sets in get_pairings(), $all_pairs only successfully gets the first set I give it! (If I swap those lines at step 2 and add $together_pairs first, $all_pairs contains only those. If I do $mixed_pairs first, $all_pairs contains only that.)
Then if I uncomment that second-to-last line in make_mixed_pairs() just for troubleshooting to see what happens, then $all_pairs does successfully include exercises from both sets!
That suggests the problem is something I'm doing wrong in making the arrays in make_mixed_pairs(), but I confirmed that the resulting format is identical in both cases.
Does anyone see what else I could be missing? I've been narrowing down this bug for 4 hours so far. I can't make it any smaller, and I can't see what's wrong :(
Update: I updated the for loop in make_mixed_pairs() to stop at $mixed_pair_count - 1 (instead of just $mixed_pair_count), and now I sometimes get one single 'together_pair' mixed in the $all_pairs results; the same damn one each time, weirdly. Though it's not 'fixed', because again when I change the order that I add the two sets in get_pairings, when I add $together_pairs first, then $all_pairs is ENTIRELY those- it's so strange...
Here are the functions: first get_pairings (relevant part is right before and after step 2):
/**
 * Used in make_workout.php: take the user's available resources, and return valid exercises
 */
function get_pairings($exercises, $count, $outdoor_partner_workout)
{
    // 1. Prep our variables, and put exercises into the appropriate buckets
    $mixed_exercises = array();
    $together_pairs = array();
    $mixed_pairs = array();
    $all_pairs = array();
    $selected_pairs = array();
    // Sort the valid exercises: self_pairing exercises go as they are, with extra
    // array for consistent formatting. Mixed ones go into $mixed_exercises array
    // for more specialized pairing in make_mixed_pairs
    foreach($exercises as $exercise)
    {
        if ($exercise['self_pairing'])
        {
            $pair = array($exercise);
            array_push($together_pairs, [$pair]);
        }
        else
        {
            $this_exercise = array($exercise);
            array_push($mixed_exercises, $this_exercise);
        }
    }
    // Now get the mixed_pairs
    $mixed_pairs = make_mixed_pairs($mixed_exercises, $outdoor_partner_workout);
    // 2. combine together into one set, and select random pairs for the workout
    // Add both sets to the array of all pairs (to pick from afterward)
    $all_pairs += $mixed_pairs;
    $all_pairs += $together_pairs;
    // Now let's choose at random our desired # of pairs, and save them in $selected_pairs
    $pairing_keys = array_rand($all_pairs, $count);
    foreach($pairing_keys as $key)
    {
        array_push($selected_pairs, $all_pairs[$key]);
    }
    // Finally, shuffle it so we don't always see the self-pairs first
    shuffle($selected_pairs);
    return $selected_pairs;
}
And the other one- make_mixed_pairs: there are two cases, the first is complicated (and shows the bug) and the second is simple (and works):
/**
 * Used by get_pairings: in case of a partner workout that has open space (where
 * one person can travel to a point while the other does an exercise til they return)
 * we'll pair exercises in a special way. (If not, fine to grab random pairs)
 */
function make_mixed_pairs($mixed_exercises, $outdoor_partner_workout)
{
    $mixed_pairs = array();
    // When it's an outdoor partner workout, we want to pair travelling with stationary
    // put them into arrays and then we'll make pairs using one from each
    if ($outdoor_partner_workout)
    {
        $mixed_travelling = array();
        $mixed_stationary = array();
        foreach($mixed_exercises as $exercise)
        {
            if ($exercise[0]['travelling'])
            {
                array_push($mixed_travelling, $exercise);
            }
            else
            {
                array_push($mixed_stationary, $exercise);
            }
        }
        shuffle($mixed_travelling);
        shuffle($mixed_stationary);
        // determine the smaller set, and pair exercises that many times
        $mixed_pair_count = min(count($mixed_travelling), count($mixed_stationary));
        for ($i=0; $i < $mixed_pair_count; $i++)
        {
            $this_pair = array($mixed_travelling[$i], $mixed_stationary[$i]);
            array_push($mixed_pairs, $this_pair); // problem is adding them here- we get only self_pairs
        }
    }
    // Otherwise we can just grab pairs from mixed_exercises
    else
    {
        // shuffle the array so it's in random order, then chunk it into pairs
        shuffle($mixed_exercises);
        $mixed_pairs = array_chunk($mixed_exercises, 2);
    }
    // $mixed_pairs = array_chunk($mixed_exercises, 2); // when I replace it with this, it works
    return $mixed_pairs;
}
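Oh for Pete's sake: I mentioned this to a friend, who told me that union is flukey in PHP, and that I should use array_merge instead. It turns out the + operator works on keys: when a key exists in both arrays, it keeps the left-hand value and drops the right-hand one. Both of my sets are numerically indexed from 0, so the keys collided and most of the second set was thrown away.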
I replaced these lines:
$all_pairs += $together_pairs;
$all_pairs += $mixed_pairs;
with this:
$all_pairs = array_merge($together_pairs, $mixed_pairs);
And now it all works.
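A small, self-contained sketch of the difference between the two operations, using made-up exercise names in place of the real pair arrays:

```php
<?php
// Placeholder data: two plain lists, both indexed from 0.
$together_pairs = ['squat', 'lunge', 'plank'];   // keys 0, 1, 2
$mixed_pairs    = ['sprint', 'burpee'];          // keys 0, 1

// Union (+) keeps the left-hand value for every key present on both sides,
// so only key 2 ('plank') is taken from the right-hand array.
$union = $mixed_pairs + $together_pairs;
print_r($union);   // sprint, burpee, plank

// array_merge renumbers integer keys and appends the second list.
$merged = array_merge($mixed_pairs, $together_pairs);
print_r($merged);  // sprint, burpee, squat, lunge, plank
```

This also matches the symptom from the update: once the first list became one element shorter, exactly one element of the second list (the one at the first non-colliding index) made it into the result.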
I was wondering, do too many IF statements bloat coding and when is it okay not to use them?
These two examples both work the same and I'm the only one editing / using the script. Am I teaching myself bad habits by not adding the IF statement?
if ($en['mm_place']) {
    $tmp = explode(",", $en['mm_place']);
    $en['mm_place'] = $tmp[0].", ".$tmp[1]." ".$tmp[2];
}
is the same as...
$tmp = explode(",", $en['mm_place']);
$en['mm_place'] = $tmp[0].", ".$tmp[1]." ".$tmp[2];
EDIT: using @Francis Avila's example I came up with this...
if ($en['mm_wmeet']) {
    $tmp = explode(",", $en['mm_wmeet']);
    for ($i = 0; $i < count($tmp); $i++) {
        $en['mm_wmeet'] = $tmp[$i];
    }
}
In this particular example, they are not the same.
If $en['mm_place'] is empty, then $tmp will not have three elements, so your string construction will be bogus.
Actually what you need is probably this:
if (!empty($en['mm_place'])) { // depending on whether you know if this is set and must be a string.
    $tmp = explode(',', $en['mm_place'], 3);
    if (count($tmp)===3) {
        $en['mm_place'] = "{$tmp[0]}, {$tmp[1]} {$tmp[2]}";
    }
}
Run PHP with E_NOTICE set, and code in such a way that you don't get any notices. PHP requires an extraordinary amount of discipline to use safely and properly because it has so many sloppy misfeatures. Notices will inform you of most bad practices. You will probably end up using lots of if statements.
If they don't serve any purpose, then yes, you're bloating.
In this kind of situation, where you are checking if an array element exists before operating on it, you should keep the if-statement in the code. Here it will only throw a notice if the element is missing, but in the future you definitely could have similar code that will crash if the element is not set.
Edit: Actually, those two code samples are not the same. If $en['mm_place'] is null or not set, the first sample will leave it as such, while the second will replace it with a string containing only the separators (a comma followed by two spaces).
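One way to keep such checks from cluttering the calling code is to move them into a small function. This is only a sketch: the name formatPlace and the sample value are invented for illustration, and it assumes the value is either null or a string.

```php
<?php
// Hypothetical helper: reformat "a,b,c" as "a, b c", leaving anything else untouched.
function formatPlace(?string $place): ?string
{
    if ($place === null || $place === '') {
        return $place; // nothing to reformat
    }
    $parts = explode(',', $place, 3);
    if (count($parts) !== 3) {
        return $place; // unexpected shape: leave it as it is
    }
    return "{$parts[0]}, {$parts[1]} {$parts[2]}";
}

echo formatPlace('Springfield,IL,62704') . "\n"; // Springfield, IL 62704
```

The if statements are still there, but they live in one place and the caller shrinks to a single assignment such as $en['mm_place'] = formatPlace($en['mm_place'] ?? null).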
If I have this pattern to build IDs:
CX00, where the 0s can be replaced with any digit, but the following are already in use:
- CX00
- CX02
- CX04
- CX05
- CX07
- CX10
- CX11
- CX12
How can I easily find, via either PHP or MySQL, the values CX01, CX03, CX06, CX08, CX09, CX13+ as available values?
I don't know how your data is stored, so I'll leave getting the IDs in an array. Once you do, this will find the next available one.
<?php
// Build an ID such as CX07 from its number.
function make_id($n) {
    $out = 'CX';
    if ($n < 10) $out .= '0';
    return $out . $n;
}
// Get these from some source
$items = array('CX00', 'CX02', 'CX04', 'CX05');
// Count up from 0 until we reach an ID that is not taken yet.
$id = make_id(0);
for ($i = 1; in_array($id, $items); $i++) {
    $id = make_id($i);
}
echo $id;
?>
This is called a brute-force method, and if you have a lot of IDs there are probably more efficient ways. With 100 maximum, there shouldn't be any problems.
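If you want every free ID rather than just the next one, a sketch along these lines should work. It assumes the two-digit range 00 to 99 mentioned above, and make_id is rewritten here with sprintf for brevity.

```php
<?php
// Build an ID such as CX07 from its number.
function make_id(int $n): string
{
    return sprintf('CX%02d', $n);
}

// IDs already taken (in practice these would come from your data source).
$used = ['CX00', 'CX02', 'CX04', 'CX05', 'CX07', 'CX10', 'CX11', 'CX12'];

// Every possible ID in the two-digit range, minus the used ones.
$all  = array_map('make_id', range(0, 99));
$free = array_values(array_diff($all, $used));

echo implode(', ', array_slice($free, 0, 6)) . "\n"; // CX01, CX03, CX06, CX08, CX09, CX13
```

array_diff keeps the original order, so $free[0] is always the lowest available ID.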
In PHP, simply count up through the IDs using a for loop until you have found enough of the unused ones. How many IDs do you expect to have in total?