PHP: Custom parser for non-formatted text

I am trying to create a project that will help students study various subjects. I have a piece of raw text containing quiz questions and answers, and I want to parse each one into a question header and its answer options, which will then be inserted into a database. However, the text is not consistently formatted, and with roughly 20k questions and answers in total I cannot afford the time to insert them manually or to clean up the text by hand.
The raw text looks like this:
1. A car averages 27 miles per gallon. If gas costs $4.04 per gallon, which of the following is closest to how much the gas would cost for this car to travel 2,727 typical miles?
a) $44.44 b) $109.08 c) $118.80
d) $408.04 e)
$444.40
2. When x = 3 and y = 5, by how much does the value of 3x2 – 2y exceed the value of 2x2 – 3y ?
a) 4
b) 14
c) 16
d) 20 e) 50
I tried writing my own PHP functions to parse the text, but I cannot get past the random line breaks, spaces, etc.
What I am trying to obtain:
array(1) {
[0]=>
array(3) {
["questionNumber"]=>
string(1) "1"
["questionText"]=>
string(175) "A car averages 27 miles per gallon. If gas costs $4.04 per gallon, which of the following is closest to how much the gas would cost for this car to travel 2,727 typical miles?"
["options"]=>
array(5) {
["a"]=>
string(6) "$44.44"
["b"]=>
string(7) "$109.08"
["c"]=>
string(7) "$118.80"
["d"]=>
string(7) "$408.04"
["e"]=>
string(7) "$444.40"
}
}
}
The code I have so far:
$rawText = '1. A car averages 27 miles per gallon. If gas costs $4.04 per gallon, which of the following is closest to how much the gas would cost for this car to travel 2,727 typical miles?
a) $44.44 b) $109.08 c) $118.80
d) $408.04 e)
$444.40
2. When x = 3 and y = 5, by how much does the value of 3x2 – 2y exceed the value of 2x2 – 3y ?
a) 4
b) 14
c) 16
d) 20 e) 50
';
$rawTextLines = explode("\n", $rawText);
foreach ($rawTextLines as $lineNumber => $lineContents) {
$lContents = trim($lineContents);
if (empty ($lContents)) {
unset ($rawTextLines[$lineNumber]);
} else {
$rawTextLines[$lineNumber] = $lContents;
}
}
$processedQuestions = array ();
$currentQuestionHeader = 0;
foreach ($rawTextLines as $lineNumber => $lineContents) {
if (ctype_digit(substr($lineContents, 0, 1))) { // Question header
$questionHeaderInformation = explode('.', $lineContents);
$currentQuestionHeader = $questionHeaderInformation[0];
$processedQuestions[$currentQuestionHeader]['questionNumber'] = $currentQuestionHeader;
$processedQuestions[$currentQuestionHeader]['questionText'] = $questionHeaderInformation[1];
} else { // Question option
$options = explode(')', $lineContents);
if (count ($options) % 2 === 0) {
$processedQuestions[$currentQuestionHeader]['options'][trim($options[0])] = ucfirst(trim($options[1]));
} else {
}
}
}
Which produces this:
array(2) {
[1]=>
array(3) {
["questionNumber"]=>
string(1) "1"
["questionText"]=>
string(35) " A car averages 27 miles per gallon"
["options"]=>
array(1) {
["a"]=>
string(8) "$44.44 b"
}
}
[2]=>
array(3) {
["questionNumber"]=>
string(1) "2"
["questionText"]=>
string(96) " When x = 3 and y = 5, by how much does the value of 3x2 – 2y exceed the value of 2x2 – 3y ?"
["options"]=>
array(3) {
["a"]=>
string(1) "4"
["b"]=>
string(2) "14"
["c"]=>
string(2) "16"
}
}
}
As you can see, the current output is nowhere near what I am trying to obtain.
Thank you in advance.

Hello,
^[0-9]+\. (.*)[\r\n]+a\)[\s]+(.*)[\s]+b\)[\s]+(.*)[\s]+c\)[\s]+(.*)[\s]+d\)[\s]+(.*)[\s]+e\)[\s]+(.*)[\s]*
Try it:
$re = '/^[0-9]+\. (.*)[\r\n]+a\)[\s]+(.*)[\s]+b\)[\s]+(.*)[\s]+c\)[\s]+(.*)[\s]+d\)[\s]+(.*)[\s]+e\)[\s]+(.*)[\s]*/m';
$str = '1. A car averages 27 miles per gallon. If gas costs $4.04 per gallon, which of the following is closest to how much the gas would cost for this car to travel 2,727 typical miles?
a) $44.44 b) $109.08 c) $118.80
d) $408.04 e)
$444.40
2. When x = 3 and y = 5, by how much does the value of 3x2 – 2y exceed the value of 2x2 – 3y ?
a) 4
b) 14
c) 16
d) 20 e) 50';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
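To get from the raw $matches to the structure asked for in the question, the capture groups can be mapped onto named keys. The following is a minimal sketch, not part of the original answer: the function name parseQuestions is made up for illustration, the pattern adds one capture group for the question number, and it assumes every question has exactly five options labelled a) to e).

```php
<?php
// Hypothetical helper: reshape the regex matches into question arrays.
// Assumes exactly five options (a to e) per question.
function parseQuestions(string $text): array
{
    $re = '/^([0-9]+)\.\s+(.*)[\r\n]+a\)\s+(.*)\s+b\)\s+(.*)\s+c\)\s+(.*)\s+d\)\s+(.*)\s+e\)\s+(.*)\s*/m';
    preg_match_all($re, $text, $matches, PREG_SET_ORDER);

    $questions = [];
    foreach ($matches as $m) {
        // $m[1] is the number, $m[2] the question text, $m[3]..$m[7] the options.
        $questions[] = [
            'questionNumber' => $m[1],
            'questionText'   => trim($m[2]),
            'options'        => [
                'a' => trim($m[3]),
                'b' => trim($m[4]),
                'c' => trim($m[5]),
                'd' => trim($m[6]),
                'e' => trim($m[7]),
            ],
        ];
    }
    return $questions;
}

$sample = "1. What is 2 + 2?\na) 1 b) 2 c) 3\nd) 4 e)\n5\n";
var_dump(parseQuestions($sample));
```

Questions with fewer or more options, or with question text that spans several lines, will not match this pattern and would need a looser expression.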


Adding year and leading zeroes in id

I have a column named ID, and I want it to include the year in which the record was created, in a fixed format.
Scenario
I created a new record today (1-11-2019), and it got this ID:
2019-01-110000-1
When the counter goes past 9, the last part should become 0000-0010.
Example:
20190101-0000-0001
...
20190101-0000-0010
View
<?= $purchase_order->date . sprintf("%06s",-$counter) ?>
Question: How can I add leading zeroes to 0000-1, so that it becomes 0000-0001 and, once the counter goes past 9, 0000-0010?
Math.
function weirdNumberFormatter(int $input) {
$high = intdiv($input, 10000);
$low = $input % 10000;
return sprintf('%04d-%04d', $high, $low);
}
var_dump(
weirdNumberFormatter(1),
weirdNumberFormatter(1234),
weirdNumberFormatter(12345),
weirdNumberFormatter(1234567),
weirdNumberFormatter(12345678)
);
Output:
string(9) "0000-0001"
string(9) "0000-1234"
string(9) "0001-2345"
string(9) "0123-4567"
string(9) "1234-5678"
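The same arithmetic can be combined with the date prefix from the question's example IDs. This is only a sketch: buildRecordId is a made-up name, and the Ymd date format is an assumption read off the example 20190101-0000-0001.

```php
<?php
// Hypothetical helper: date prefix plus an 8-digit counter split as 0000-0000.
function buildRecordId(DateTimeInterface $date, int $counter): string
{
    return sprintf(
        '%s-%04d-%04d',
        $date->format('Ymd'),     // assumed date format, e.g. 20190101
        intdiv($counter, 10000),  // high four digits
        $counter % 10000          // low four digits
    );
}

echo buildRecordId(new DateTimeImmutable('2019-01-01'), 10), "\n";
```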

PHP - Read a result txt file to get vars

What I'm trying to do seems impossible with my current PHP skills. Below is an example of a race result file, in txt format. The file is composed of:
dir= the-track-file-name
longname= the-track-long-name-with-spaces
firstlap= the number of gates (checkpoints) the first lap is composed of
normallap= the number of gates (checkpoints) the other laps are composed of
holeshotindex= the first gate after the start, which determines which player started first
time= the race duration, in minutes
laps= the number of laps (if minutes + laps, laps are counted when time's up)
starttime=1793
date= timestamp of the start
players: (under this line are all the players, just 1 in this example)
slot=0 (this is the multiplayer server slot taken by the player)
uid=5488 (this is the unique ID of the player)
number=755 (player's race number)
bike=rm125 (player's motorbike model)
name=Nico #755 (player's name)
times: (under this line are the timestamps of every gate, in the form SLOT|GATE|TIME)
0 0 1917 (dividing the timestamp by 128 seems to give a sensible value)
0 1 2184
(and so on, see the full example below...)
The game server runs on a dedicated Ubuntu machine.
At the end of each race I send these results to an FTP web server. What I need is to extract the values so that I can output something readable, such as a results table, after a race is selected (e.g. from a dropdown list).
Building the table isn't the problem.
My problem is that, even after searching a lot here, I don't know how to read the txt file to obtain this kind of page (only the RESULTS table): http://mxsimulator.com/servers/q2.MXSConcept.com/races/6015.html
Here is a full sample result file: http://www.mediafire.com/view/3b34a4kd5nfsj4r/sample_result_file.txt
Thank you
Ok, tonight it's file parsing time.
I've written a very basic parser, which walks through the data line by line.
First it looks for "=". When a "=" is found the line is split/exploded at "=".
You get two parts: before and after the "=".
I've used them as key and value in an $results array.
This process continues till we reach the line "times:".
That's the line indicating that on the next line (line "times:" + 1) the results start.
The results are "slot gate time" separated by spaces. So the results are exploded with " " (space) this time and you get the three parts.
I've inserted an array key 'times' which contains an array with named keys (slot,gate,time).
You might just look at the structure of the $results array.
It should be very easy to iterate over it to render a table or output data.
#$datafile = 'http://www.mediafire.com/view/3b34a4kd5nfsj4r/sample_result_file.txt';
#$lines = file_get_contents($datafile); // returns one string, which is split into lines further down
$lines = '
dir=Dardon-Gueugnon
longname=Dardon Gueugnon
firstlap=72
normallap=71
holeshotindex=1
time=0
laps=6
starttime=1846
date=1407162774
players:
slot=0
uid=8240
number=172
bike=rm125
name=Maximilien Jannot | RH-Factory
slot=1
uid=7910
number=666
bike=rm125
name=Patrick Corvisier|Team RH-Factory
slot=2
uid=10380
number=114
bike=rm125
name=Benoit Krawiec | MXS-Concept.com
slot=6
uid=6037
number=59
bike=rm125
name=Yohan Levrage | SPEED
slot=8
uid=6932
number=447
bike=rm125
name=Morgan Marlet | Mxs-Concept.com
times:
6 0 1974
1 0 1989
0 0 2020
2 0 2056
6 1 2242
1 1 2260
0 1 2313
2 1 2338
6 2 2434
1 2 2452';
$results = array();
$parseResults = false;
#foreach($lines as $line){ // use this line only if $lines is already an array of lines, e.g. from file()
foreach(preg_split("/((\r?\n)|(\r\n?))/", $lines) as $line){
if($parseResults === true) {
$parts = preg_split('/\s+/', trim($line)); // SLOT|GATE|TIME = parts 0|1|2
if(count($parts) < 3) {
continue; // skip blank or malformed lines
}
$times = array(
'slot' => $parts[0],
'gate' => $parts[1],
'time' => $parts[2]
);
$results['times'][] = $times;
}
if(false !== strpos($line, '=')) { // if line has a = in it, explode it
$parts = explode('=', $line, 2); // limit 2 keeps any further "=" inside the value
$results[$parts[0]] = $parts[1]; // assign parts to array as key=value
}
if(false !== strpos($line, 'times:')) {
// we reached "times:", let's set a flag to start reading results in the next iteration
$parseResults = true;
}
}
var_dump($results);
Output:
array(15) {
["dir"]=> string(15) "Dardon-Gueugnon"
["longname"]=> string(15) "Dardon Gueugnon"
....
["name"]=> string(31) "Morgan Marlet | Mxs-Concept.com"
["times"]=> array(10) {
[0]=> array(3) { ["slot"]=> string(1) "6" ["gate"]=> string(1) "0" ["time"]=> string(4) "1974" }
[1]=> array(3) { ["slot"]=> string(1) "1" ["gate"]=> string(1) "0" ["time"]=> string(4) "1989" }
[2]=> array(3) { ["slot"]=> string(1) "0" ["gate"]=> string(1) "0" ["time"]=> string(4) "2020" }
...
} }
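Note that in the output above only the last player survives (array(15) holds a single slot/uid/number/bike/name set), because every player block writes to the same keys. A possible variation, sketched here under the assumption that each player block starts with its slot= line, keeps race header, players and times apart. The function name parseRaceResult and the keys race, players and times are made up for illustration.

```php
<?php
// Sketch only: illustrative names, not part of the original answer.
function parseRaceResult(string $text): array
{
    $result  = ['race' => [], 'players' => [], 'times' => []];
    $section = 'race';

    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        // "players:" and "times:" switch the section being read.
        if ($line === 'players:' || $line === 'times:') {
            $section = rtrim($line, ':');
            continue;
        }
        if ($section === 'times') {
            $parts = preg_split('/\s+/', $line);
            if (count($parts) >= 3) {
                $result['times'][] = [
                    'slot' => (int) $parts[0],
                    'gate' => (int) $parts[1],
                    'time' => (int) $parts[2],
                ];
            }
            continue;
        }
        if (strpos($line, '=') === false) {
            continue; // not a key=value line
        }
        [$key, $value] = array_map('trim', explode('=', $line, 2));
        if ($section === 'race') {
            $result['race'][$key] = $value;
            continue;
        }
        // Assumption: a "slot=" line opens a new player block.
        if ($key === 'slot' || $result['players'] === []) {
            $result['players'][] = [];
        }
        $result['players'][array_key_last($result['players'])][$key] = $value;
    }
    return $result;
}
```

With this layout, the times rows can be matched to a player through the slot value when rendering the table. array_key_last() requires PHP 7.3 or later.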

values unnecessarily rounded

I have some database figures that I am doing some simple math with. For some reason, I can't keep the total from rounding to the nearest dollar. I need to include the cents information, as well, though. I am positive that each itemPrice entry contains two decimal places in the database.
if (strpos($row2["itemDiscount"],'%') !== false) {
$itemDiscount = $row2["itemDiscount"];
$itemDetailTotalUnformatted = $row2["itemQuantity"]*($itemPrice*(1-($itemDiscount/100)));
}
else {
$itemDetailTotalUnformatted = $row2["itemQuantity"]*($row2["itemPrice"]-$row2["itemDiscount"]);
}
$itemDetailTotal = number_format($itemDetailTotalUnformatted, 2, '.', '');
echo $itemDetailTotal;
var_dump($row2):
50.00array(6) {
[0]=>
string(1) "2"
["itemQuantity"]=>
string(1) "2"
[1]=>
string(5) "30.00"
["itemPrice"]=>
string(5) "30.00"
[2]=>
string(4) "5.00"
["itemDiscount"]=>
string(4) "5.00"
When dealing with currency, ALWAYS work in integers. Save the prices in cents, handle prices in cents, and only at the very end do you divide by 100 to present the result.
The reason for this is that ints have perfect precision (up to obscenely high values, where they are handled as floats instead), whereas floats do not. There is no fixed-point type in PHP.
Once you do that, your rounding problems will probably disappear.
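As a rough illustration of working in cents, here is a minimal sketch. The helper names toCents and formatCents are made up, and the code assumes non-negative amounts stored as decimal strings with at most two decimal places, as in the var_dump above.

```php
<?php
// Hypothetical helpers: keep money as integer cents, format only for display.

// Convert a decimal string such as "30.00" to integer cents without using floats.
function toCents(string $amount): int
{
    [$whole, $fraction] = array_pad(explode('.', $amount, 2), 2, '');
    $fraction = substr(str_pad($fraction, 2, '0'), 0, 2); // "5" -> "50", "" -> "00"
    return (int) $whole * 100 + (int) $fraction;
}

// Turn integer cents back into a display string with two decimals.
function formatCents(int $cents): string
{
    return sprintf('%d.%02d', intdiv($cents, 100), $cents % 100);
}

// Values taken from the question: quantity 2, price 30.00, discount 5.00.
$total = 2 * (toCents('30.00') - toCents('5.00'));
echo formatCents($total), "\n";
```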

How to make 5 random numbers with sum of 100 [duplicate]

This question already has answers here:
Getting N random numbers whose sum is M
Do you know a way to split an integer into, say, 5 groups?
Each group's value must be random, but their total must equal a fixed number.
For example, I have 100 and want to split it into:
1- 20
2- 3
3- 34
4- 15
5- 18
EDIT: I forgot to say that, yes, a balance would be a good thing. I suppose this could be done with an if statement that blocks any number above 30, for instance.
I have a slightly different approach to some of the answers here. I compute a rough share of the total based on how many numbers you want, and then vary it by plus or minus 10% at random.
I do this n-1 times (n being the number of values wanted), which leaves a remainder. The remainder becomes the last number; it isn't truly random in itself, but it is derived from other random numbers.
Works pretty well.
/**
* Calculate n random numbers that sum y.
* Function calculates a percentage based on the number
* required, gives a random number around that number, then
* deducts the rest from the total for the final number.
* Final number cannot be truly random, as it's a fixed total,
* but it will appear random, as it's based on other random
* values.
*
* @author Mike Griffiths
* @return array
*/
private function _random_numbers_sum($num_numbers=3, $total=500)
{
$numbers = [];
$loose_pcc = $total / $num_numbers;
// Random number +/- 10%, rounded to whole numbers because mt_rand() expects integers
$ten_pcc = $loose_pcc * 0.1;
$min = (int) round($loose_pcc - $ten_pcc);
$max = (int) round($loose_pcc + $ten_pcc);
for($i = 1; $i < $num_numbers; $i++) {
$rand_num = mt_rand($min, $max);
$numbers[] = $rand_num;
}
// $numbers now contains 1 less number than it should do, sum
// all the numbers and use the difference as final number.
$numbers_total = array_sum($numbers);
$numbers[] = $total - $numbers_total;
return $numbers;
}
This:
$random = $this->_random_numbers_sum();
echo 'Total: '. array_sum($random) ."\n";
print_r($random);
Outputs:
Total: 500
Array
(
[0] => 167
[1] => 164
[2] => 169
)
Pick 4 random numbers, each around an average of 20 (with a spread of, e.g., around 40% of 20, i.e. 8). Add a fifth number such that the total is 100.
In response to several other answers here: the last number cannot in fact be random, because the sum is fixed. To picture it, think of a line from 0 to 100 with only 4 points on it that can be chosen randomly. Each point is placed cumulatively, by adding a random number around the mean (total/n, i.e. 20) to the previous one. The result is 5 spacings, which are the 5 numbers you are looking for.
Depending on how random you need it to be, and how resource-rich the environment you plan to run the script in is, you might try the following approach.
<?php
set_time_limit(10);
$number_of_groups = 5;
$sum_to = 100;
$groups = array();
$group = 0;
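// Keep re-rolling one group at a time until all groups are filled and the sum matches
while(count($groups) < $number_of_groups || array_sum($groups) != $sum_to)
{
$groups[$group] = mt_rand(0, intdiv($sum_to, mt_rand(1, 5))); // intdiv() keeps the upper bound an integer
if(++$group == $number_of_groups)
{
$group = 0;
}
}
var_dump($groups);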
The generated results look something like this. Pretty random.
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(11)
[1]=>
int(2)
[2]=>
int(13)
[3]=>
int(9)
[4]=>
int(65)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(9)
[1]=>
int(29)
[2]=>
int(21)
[3]=>
int(27)
[4]=>
int(14)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(18)
[1]=>
int(26)
[2]=>
int(2)
[3]=>
int(5)
[4]=>
int(49)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(20)
[1]=>
int(25)
[2]=>
int(27)
[3]=>
int(26)
[4]=>
int(2)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(9)
[1]=>
int(18)
[2]=>
int(56)
[3]=>
int(12)
[4]=>
int(5)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(0)
[1]=>
int(50)
[2]=>
int(25)
[3]=>
int(17)
[4]=>
int(8)
}
[root@server ~]# php /var/www/dev/test.php
array(5) {
[0]=>
int(17)
[1]=>
int(43)
[2]=>
int(20)
[3]=>
int(3)
[4]=>
int(17)
}
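$number = 100;
$numbers = array();
$iteration = 0;
// Draw up to 4 distinct numbers; the remainder appended below is the 5th.
while ($number > 0 && $iteration < 4) {
    // Only consider values not drawn yet, so the loop cannot spin forever
    // when every value in 1..$number has already been taken.
    $candidates = array_diff(range(1, $number), $numbers);
    if (empty($candidates)) {
        break;
    }
    $sub_number = $candidates[array_rand($candidates)];
    $iteration++;
    $number -= $sub_number;
    $numbers[] = $sub_number;
}
if ($number != 0) {
    $numbers[] = $number;
}
print_r($numbers);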
This should do what you need:
<?php
$tot = 100;
$groups = 5;
$numbers = array();
for($i = 1; $i < $groups; $i++) {
$num = rand(1, $tot-($groups-$i));
$tot -= $num;
$numbers[] = $num;
}
$numbers[] = $tot;
It won't give you a truly balanced distribution, though, since the first numbers will on average be larger.
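If the bias towards larger first numbers matters, one common alternative (not taken from the answer above) is to pick random cut points on the range 0..total, sort them, and use the gaps between neighbouring cuts as the groups. A minimal sketch, with the made-up function name splitByCutPoints; note that a group can come out as 0 when two cuts coincide.

```php
<?php
// Sketch of the cut-point technique: groups - 1 random cuts, gaps become the parts.
function splitByCutPoints(int $total, int $groups): array
{
    $cuts = [];
    for ($i = 1; $i < $groups; $i++) {
        $cuts[] = random_int(0, $total);
    }
    sort($cuts);
    $cuts[] = $total; // the final boundary closes the last gap

    $parts = [];
    $previous = 0;
    foreach ($cuts as $cut) {
        $parts[] = $cut - $previous; // size of the gap since the previous cut
        $previous = $cut;
    }
    return $parts;
}

print_r(splitByCutPoints(100, 5));
```

Because the cuts are sorted before the gaps are taken, no position in the result is favoured over another.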
I think the trick to this is to keep setting the ceiling for your random number generator to 100 - currentTotal.
The solution depends on how random you want your values to be, in other words, what random situation you're going to simulate.
To get a totally random distribution, you would do 100 draws, each assigning one unit to a randomly chosen group. In symbolic language (n is the total, k the number of groups):
for i from 1 to n
group[ random(1,k) ] ++;
For bigger numbers, you could increase the selected group by random(1, n/100) or something like that, until the total sum matches n.
However, you want balance, so I think the best option for you would be the normal distribution. Draw 5 Gaussian values, which divide the number (their sum) into 5 parts. Then scale these parts so that their sum is n and round them, and you have your 5 groups.
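The unit-by-unit draw described in the symbolic loop can be written in PHP roughly as follows. This is a sketch only; the function name splitByPolling is made up, and it covers the first idea (one draw per unit), not the Gaussian variant.

```php
<?php
// Sketch: assign each of $total units to a randomly chosen group.
function splitByPolling(int $total, int $groups): array
{
    $counts = array_fill(0, $groups, 0);
    for ($i = 0; $i < $total; $i++) {
        // One draw per unit; every group is equally likely each time.
        $counts[random_int(0, $groups - 1)]++;
    }
    return $counts;
}

print_r(splitByPolling(100, 5));
```

The counts always add up to the total, and for 100 units in 5 groups they tend to cluster around 20, which gives some of the balance asked for in the question.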
The solution I found to this problem is a little different, but it makes more sense to me. In this example I generate an array of numbers that add up to 960. Hope this is helpful.
// the range of the array
$arry = range(1, 999, 1);
// how many numbers do you want
$nrresult = 3;
do {
    // select three random keys from the array (array_rand() returns keys, not values)
    $keys = array_rand($arry, $nrresult);
    // map the keys back to the actual numbers
    $arry_rand = array_map(function ($key) use ($arry) {
        return $arry[$key];
    }, $keys);
    $arry_fin = array_sum($arry_rand);
    // don't stop until they sum to 960
} while ($arry_fin != 960);
// to see the results
foreach ($arry_rand as $arryid) {
    echo $arryid . '+ ';
}

PHP is not returning me a number type

I tried to follow the instructions in this question: STAR rating with css
but I have one big problem.
When I do:
<span class="stars">1.75</span>
or
$foo='1.75';
echo '<span class="stars">'.$foo.'</span>';
the stars are shown correctly, but as soon as I do:
while($val = mysql_fetch_array($result))
{
$average = ($val['services'] + $val['serviceCli'] + $val['interface'] + $val['qualite'] + $val['rapport'] ) / 5 ;
echo '<span class="stars">'.$average.'</span>';
}
the stars stop working.
I double-checked the data types in MySQL:
they're all TINYINT(2)
and I tried this:
$average = intval($average);
but it's still not working.
I think your problem may be that the value you have is greater than the 5 allowed in that example.
What you want to do is weight the items such that the total for $average is less than or equal to 5.
$average = (
( $val['services'] / $maxServices )
+ ( $val['serviceCli'] / $maxServiceCli )
+ ( $val['interface'] / $maxInterface )
+ ( $val['qualite'] / $maxQualite )
+ ( $val['rapport'] / $maxRapport )
);
The weighting could be even, so each of the values will be less than or equal to 1, or they could have different weights so services is worth more than qualite (and so on).
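The $maxServices and other $max... variables are not defined in the answer; presumably they hold the largest value each rating can take. A minimal sketch under that assumption, using a made-up maximum of 10 for every field and a hypothetical helper name starScore:

```php
<?php
// Sketch: scale each rating to 0..1 by its maximum, then add them up.
// With five fields the result stays between 0 and 5.
function starScore(array $ratings, array $maxima): float
{
    $score = 0.0;
    foreach ($maxima as $field => $max) {
        $score += $ratings[$field] / $max;
    }
    return round($score, 2);
}

// Assumed maxima; replace with the real upper bound of each column.
$maxima = ['services' => 10, 'serviceCli' => 10, 'interface' => 10, 'qualite' => 10, 'rapport' => 10];
// Example row, shaped like $val in the question.
$val = ['services' => 8, 'serviceCli' => 6, 'interface' => 7, 'qualite' => 9, 'rapport' => 5];

echo '<span class="stars">' . starScore($val, $maxima) . '</span>';
```

Giving a field a smaller maximum than the others would make it count for more, which is one way to get the uneven weighting mentioned above.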
No, var_dump($val) gives me ALL the values I get from my big query (text, varchar, int). Example:
array(38) { [0]=> string(28) "http://www.crystal-serv.com/"
["siteweb"]=> string(28) "http://www.crystal-serv.com/"
[1]=> string(1) "0" ["offreDedie"]=> string(100)
"tick rouge" [2]=> string(1) "0" ["coupon"]=>
..........
