Formatting raw text gathered from another website using php


I am trying to retrieve a table from another website. The table depends on several variables passed to the site via a form. I have worked out that the URL parameters after the ? correspond to those variables, so I created a form on my page that posts those variables and builds the URL. I pass that URL to file_get_contents and collect the table as data (I have narrowed the result to the div that houses the table).
My problem is that the data is shown on my page as a single string of plain text with no formatting (i.e. no columns or rows).
Here is the code that retrieves the data:
<?php
$page = file_get_contents($stats_url);
$doc = new DOMDocument();
$doc->loadHTML($page);
$divs = $doc->getElementsByTagName('div');
foreach($divs as $div) {
// Loop through the DIVs looking for the one with an id of "statstable"
// Then echo out its contents
if ($div->getAttribute('id') === 'statstable') {
echo $div->nodeValue;
}
}
?>
Here is a sample of the data returned:
NameGamesInnsNot OutsRunsHigh ScoreAvg50's100'sDucksStrike RateBowled (%)Caught (%)LBW (%)Stumped (%)Run Out (%)Not Out (%)Did Not Bat (%)%Games Won%Games Drawn%Games Lost%Team RunsCatchesStumpingsRun OutsOwais Fareed 1 1 0 72 7272 1 0 0 - 0 1 (100) 0 0 0 0 0 0 0 100 42.6 0 0 0 Atif Ali 2 2 0 28 2814 0 0 1 - 2 (100) 0 0 0 0 0 0 0 0 100 11.62 0 0 0 Craig Hills 2 2 0 20 1310 0 0 0 - 0 1 (50) 1 (50) 0 0 0 0 0 0 100 8.3 1 0 0 Dale Skeath 2 2 0 16 128 0 0 0 - 1 (50) 1 (50) 0 0 0 0 0 0 0 100 6.64 1 0 0 ash ashim 2 2 1 16 10*16 0 0 0 - 0 1 (50) 0 0 0 1 (50) 0 0 0 100 6.64 0 0 0 Hussain Dalvi 1 1 0 11 1111 0 0 0 - 0 1 (100) 0 0 0 0 0 0 0 100 6.51 0 0 0 Azhar Ali 1 1 0 11 1111 0 0 0 - 0 1 (100) 0 0 0 0 0 0 0 100 6.51 0 0 0 A Hammed 1 1 0 10 1010 0 0 0 - 0 1 (100) 0 0 0 0 0 0 0 100 5.92 0 0 0 M Ali 1 1 0 5 55 0 0 0 - 1 (100) 0 0 0 0 0 0 0 0 100 2.96 0 0 0 Simon Pleasant 1 1 0 5 55 0 0 0 - 0 1 (100) 0 0 0 0 0 0 0 100 6.94 0 0 0
How can I then take this text and recompile it as a table?

Check out PHP Simple HTML DOM Parser
It works brilliantly for this stuff.
http://simplehtmldom.sourceforge.net/
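If you would rather stay with the built-in DOM extension used in the question, one possible approach (a sketch, not part of the answer above) is to output the markup of the div instead of its text. nodeValue returns only the concatenated text of a node, which is why the rows and columns disappear, whereas DOMDocument::saveHTML() accepts a node and returns its HTML. The function name extractDivHtml and the sample $page markup are made up for illustration; in the real script $page would come from file_get_contents($stats_url).

```php
<?php
// Return the HTML markup of the <div> with the given id, or null if absent.
function extractDivHtml(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('id') === $id) {
            // saveHTML($node) keeps the tags, unlike $node->nodeValue.
            return $doc->saveHTML($div);
        }
    }
    return null;
}

// Hypothetical stand-in for the remote page.
$page = '<html><body><div id="statstable"><table>'
      . '<tr><th>Name</th><th>Runs</th></tr>'
      . '<tr><td>Owais Fareed</td><td>72</td></tr>'
      . '</table></div></body></html>';

$tableHtml = extractDivHtml($page, 'statstable');
echo $tableHtml;
```

Because the table tags survive, the browser renders the rows and columns again; any styling from the original site is not carried over.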

Related

Find and Replace a Column in CSV1 based on CSV2 in PHP

I am trying replace the values of column "Close" in the first CSV with the values of column "LTP" in the second CSV where the following columns should be same in CSV2 and CSV1:
"Symbol=SYMBOL", "Date=TIMESTAMP", "Expiry=EXPIRY_DT", "Option Type=OPTION_TYP", "Strike Price=STRIKE_PR"
Here is the current structure of CSV1:
INSTRUMENT SYMBOL EXPIRY_DT STRIKE_PR OPTION_TYP OPEN HIGH LOW CLOSE SETTLE_PR CONTRACTS VAL_INLAKH OPEN_INT CHG_IN_OI TIMESTAMP
OPTSTK INFY 26-Apr-18 780 CE 0 0 0 408.6 361.05 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 800 CE 0 0 0 388.95 341.15 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 820 CE 0 0 0 369.35 321.25 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 840 CE 0 0 0 349.75 301.35 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 860 CE 0 0 0 330.2 281.45 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 880 CE 0 0 0 310.75 261.55 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 900 CE 0 0 0 291.35 241.65 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 920 CE 0 0 0 272.15 221.75 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 940 CE 0 0 0 216 201.85 0 0 5400 0 2-Apr-18
OPTSTK INFY 26-Apr-18 960 CE 0 0 0 234.45 181.95 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 980 CE 0 0 0 216.1 162.1 0 0 0 0 2-Apr-18
OPTSTK INFY 26-Apr-18 1000 CE 0 0 0 136 142.3 0 0 24600 0 2-Apr-18
And the structure of CSV2 is:
Symbol Date Expiry Option Type Strike Price Open High Low Close LTP Settle Price No. of contracts Turnover in Lacs Premium Turnover in Lacs Open Int Change in OI Underlying Value
INFY 2-Apr-18 26-Apr-18 CE 780 0 0 0 408.6 0 361.05 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 1380 1 1 1 1 1 1 1 8.29 0.01 600 600 1137.15
INFY 2-Apr-18 26-Apr-18 CE 820 0 0 0 369.35 0 321.25 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 840 0 0 0 349.75 0 301.35 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 860 0 0 0 330.2 0 281.45 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 880 0 0 0 310.75 0 261.55 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 900 0 0 0 291.35 0 241.65 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 920 0 0 0 272.15 0 221.75 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 940 0 0 0 216 216 201.85 0 0 0 5400 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 960 0 0 0 234.45 0 181.95 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 980 0 0 0 216.1 0 162.1 0 0 0 0 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 1000 0 0 0 136 134 142.3 0 0 0 24600 0 1137.15
INFY 2-Apr-18 26-Apr-18 CE 1020 0 0 0 126 126 122.65 0 0 0 1200 0 1137.15

A few things about CSV files, and about files in general.
You wrote: "I am trying to replace the values of column "Close" in the first CSV with the values of column "LTP" in the second CSV".
You cannot "replace" anything in the file in place. You can, however, read the data from a file using fgetcsv, change it, and write a new file with the changed data using fputcsv.
As you have shown no code, I won't write much, but I will go over the process:
1. Read the file that contains the new values (your CSV2) completely, building an array from it.
2. Read the file you wish to change (your CSV1) row by row.
3. For each row of CSV1, look up the matching row in the array built from CSV2 and update that row's data.
4. Write a new file that contains the updated data.
You have to read the file you want to change last, because you want to preserve the order of the rows in that file, and it is simply easier that way.
A tip for matching the data: when you read in the file with the new values, take the fields you need to match and hash them using md5. md5 is really fast, on the order of several million hashes a second (yes, I timed it):
$time = microtime(true);
for($i=0; $i < 5000000; ++$i){
md5("hello world");
}
echo "Complete :".number_format((microtime(true) - $time), 4);
Try it yourself
Output
Complete :1.3057
This has nothing to do with security; the hash is only used as a lookup key, as explained next.
When you build your array (step 1), take the fields you want to match, hash them, and use the hash as the key in your array.
$hash = md5(
$row['Symbol'] .
strtotime($row['Date']) . //convert to timestamp
$row['Expiry'] .
$row['Option Type'] .
$row['Strike Price']
);
$data[$hash] = $row;
Then do the same when you read in the second file (the file you want to modify)
$newHash = md5(
$row['SYMBOL'] .
strtotime($row['TIMESTAMP']) . //convert to timestamp
$row['EXPIRY_DT'] .
$row['OPTION_TYP'] .
$row['STRIKE_PR']
);
Then simply look up the hash in the $data array with isset($data[$newHash]) as you go through the file you are modifying. Once you have found the row with the data you need, the rest is pretty trivial.
Now the basic way to read a file is
$h = fopen('filename.csv', 'r');
$data = [];
$headers = false;
while(($row = fgetcsv($h)) !== false){
    if(!$headers){
        $headers = $row; //the first row holds the column names
        continue;
    }
    $row = array_combine($headers, $row);
    //do the hash "and other stuff"
    $data[$hash] = $row;
}
fclose($h);
Then, once that file is done, read the other file with the same kind of loop: compute the hash, match the data you want, update the old data, write the row to a new (third) file, and so on.
For the hashing to work, the data must match from one CSV to the other (obviously), and there cannot be duplicate "hashed" rows, i.e. rows whose key fields hash the same but whose other fields differ. For example, if two rows in the same file hash the same but have different Close values, that would cause you trouble no matter what.
Good luck. Look up some tutorials on how to process CSV files; there are plenty of them out there.
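The process described above could be put together roughly as follows. This is only a sketch under a few assumptions: both files are comma-separated with a header row, the header names are the ones shown in the question, and the dates are written identically in both files (so they are compared as plain strings here rather than through strtotime). The function names rowKey, readCsv and mergeLtpIntoClose are made up for this example.

```php
<?php
// Build the lookup key from the five matching fields (the order matters).
function rowKey(string $symbol, string $date, string $expiry, string $type, string $strike): string
{
    // A separator avoids accidental collisions such as "ab"."c" vs "a"."bc".
    return md5(implode('|', [$symbol, $date, $expiry, $type, $strike]));
}

// Read a CSV file into a list of associative rows keyed by the header names.
function readCsv(string $path): array
{
    $h = fopen($path, 'r');
    $headers = fgetcsv($h, 0, ',', '"', '\\');
    $rows = [];
    while (($row = fgetcsv($h, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // skip blank lines
        }
        $rows[] = array_combine($headers, $row);
    }
    fclose($h);
    return $rows;
}

// Copy CSV1 to $outPath, replacing CLOSE with LTP from the matching CSV2 row.
function mergeLtpIntoClose(string $csv1Path, string $csv2Path, string $outPath): void
{
    // Step 1: index CSV2 (the file with the new values) by hash.
    $ltpByKey = [];
    foreach (readCsv($csv2Path) as $r) {
        $key = rowKey($r['Symbol'], $r['Date'], $r['Expiry'], $r['Option Type'], $r['Strike Price']);
        $ltpByKey[$key] = $r['LTP'];
    }

    // Steps 2-4: walk CSV1 in its original order, update matches, write the new file.
    $rows = readCsv($csv1Path);
    $out = fopen($outPath, 'w');
    if ($rows !== []) {
        fputcsv($out, array_keys($rows[0]), ',', '"', '\\');
    }
    foreach ($rows as $r) {
        $key = rowKey($r['SYMBOL'], $r['TIMESTAMP'], $r['EXPIRY_DT'], $r['OPTION_TYP'], $r['STRIKE_PR']);
        if (isset($ltpByKey[$key])) {
            $r['CLOSE'] = $ltpByKey[$key];
        }
        fputcsv($out, $r, ',', '"', '\\');
    }
    fclose($out);
}
```

A call such as mergeLtpIntoClose('csv1.csv', 'csv2.csv', 'csv1_updated.csv') (hypothetical file names) leaves both inputs untouched. Rows of CSV1 without a match in CSV2 are written out unchanged.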

How do I make my word unscrambler return more relevant results

I am building a word unscrambler (php/mysql) that takes user input of between 2 and 8 letters and returns words of between 2 and 8 letters that can be made from those letters, not necessarily using all of the letters, but definitely not including more letters than supplied.
The user will enter something like MSIKE or MSIKEI (two i's), or any combination of letters or multiple occurrences of a letter.
The query below will find all occurrences of words that contain M, S, I, K, or E.
However, the query below also returns words that have multiple occurrences of letters not requested. For example, the word meek would be returned, even though it has two e's and the user didn't enter two e's, or the word kiss, even though the user didn't enter s twice.
SELECT word
FROM words
WHERE word REGEXP '[msike]'
AND has_a=0
AND has_b=0
AND has_c=0
AND has_d=0
-- (we skip e), or we could add has_e=1
AND has_f=0
-- ...and so on, skipping letters m, s, i, k, and e
AND has_w=0
AND has_x=0
AND has_y=0
AND has_z=0
Note the columns has_a, has_b, etc are either 1 if the letter occurs in the word or 0 if not.
I am open to any changes to the table schema.
This site: http://grecni.com/texttwist.php is a good example of what I am trying to emulate.
Question is how to modify the query to not return words with multiple occurrences of a letter, unless the user specifically entered a letter multiple times. Grouping by word length would be an added bonus.
Thanks so much.
EDIT: I altered the db per the suggestion of @awei. The has_{letter} columns are now count_{letter} columns and store the total number of occurrences of the respective letter in the respective word. This could be useful when a user enters a letter multiple times, for example when the user enters MSIKES (two s).
Additionally, I have abandoned the REGEXP approach shown in the original SQL statement. I am now trying to do most of the work on the PHP side, but there are still many hurdles in the way.
EDIT: Included first 10 rows from table
id word alpha otcwl ospd csw sowpods dictionary enable vowels consonants start_with end_with end_with_ing end_with_ly end_with_xy count_a count_b count_c count_d count_e count_f count_g count_h count_i count_j count_k count_l count_m count_n count_o count_p count_q count_r count_s count_t count_u count_v count_w count_x count_y count_z q_no_u letter_count scrabble_points wwf_points status date_added
1 aa aa 1 0 0 1 1 1 aa a a 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 1 2015-11-12 05:39:45
2 aah aah 1 0 0 1 0 1 aa h a h 0 0 0 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 6 5 1 2015-11-12 05:39:45
3 aahed aadeh 1 0 0 1 0 1 aae hd a d 0 0 0 2 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 9 8 1 2015-11-12 05:39:45
4 aahing aaghin 1 0 0 1 0 1 aai hng a g 1 0 0 2 0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 6 10 11 1 2015-11-12 05:39:45
5 aahs aahs 1 0 0 1 0 1 aa hs a s 0 0 0 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 4 7 6 1 2015-11-12 05:39:45
6 aal aal 1 0 0 1 0 1 aa l a l 0 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 3 4 1 2015-11-12 05:39:45
7 aalii aaiil 1 0 0 1 1 1 aaii l a i 0 0 0 2 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 5 6 1 2015-11-12 05:39:45
8 aaliis aaiils 1 0 0 1 0 1 aaii ls a s 0 0 0 2 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 6 6 7 1 2015-11-12 05:39:45
9 aals aals 1 0 0 1 0 1 aa ls a s 0 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 4 4 5 1 2015-11-12 05:39:45
10 aardvark aaadkrrv 1 0 0 1 1 1 aaa rdvrk a k 0 0 0 3 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 8 16 17 1 2015-11-12 05:39:45

I think you've already done the hard work with your revised schema. All you need to do now is modify the query so that the count of each letter is <= the number of times the user entered that letter.
E.g. if the user entered "ALIAS":
SELECT word
FROM words
WHERE count_a <= 2
AND count_b <= 0
AND count_c <= 0
AND count_d <= 0
AND count_e <= 0
AND count_f <= 0
AND count_g <= 0
AND count_h <= 0
AND count_i <= 1
AND count_j <= 0
AND count_k <= 0
AND count_l <= 1
AND count_m <= 0
AND count_n <= 0
AND count_o <= 0
AND count_p <= 0
AND count_q <= 0
AND count_r <= 0
AND count_s <= 1
AND count_t <= 0
AND count_u <= 0
AND count_v <= 0
AND count_w <= 0
AND count_x <= 0
AND count_y <= 0
AND count_z <= 0
ORDER BY CHAR_LENGTH(word), word;
Note: As requested, this orders by word length, then alphabetically. I have used <= even where the limit is 0, just to make the query easier to modify by hand for other letters.
This returns "aa", "aal" and "aals" (but not "aalii" or "aaliis" since they both have two "i"s).
See SQL Fiddle Demo.
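Since the question mentions doing part of the work on the PHP side, the same "no more than supplied" rule can also be checked in PHP. The following is a small sketch, not part of the query above; the function name canMake is made up, and the word list is just the first few sample rows from the question.

```php
<?php
// True if $word can be built from $letters without using any letter
// more often than it was supplied (the same rule as count_x <= n).
function canMake(string $word, string $letters): bool
{
    // count_chars(..., 1) maps byte value => number of occurrences.
    $available = count_chars(strtolower($letters), 1);
    foreach (count_chars(strtolower($word), 1) as $byte => $needed) {
        if ($needed > ($available[$byte] ?? 0)) {
            return false;
        }
    }
    return true;
}

// Words taken from the sample rows in the question.
$words = ['aa', 'aah', 'aal', 'aalii', 'aaliis', 'aals'];
$matches = array_values(array_filter($words, fn($w) => canMake($w, 'ALIAS')));

// Same ordering as the query: by length, then alphabetically.
usort($matches, fn($a, $b) => [strlen($a), $a] <=> [strlen($b), $b]);
print_r($matches);
```

For the input "ALIAS" this keeps "aa", "aal" and "aals", the same words the query returns for those rows.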

Since you have two different requirements, I suggest implementing two different solutions.
Where you don't care about dup letters, build a SET datatype with the 26 letters. Populate the bits according to what the word has. This ignores duplicate letters. This also facilitates looking for words with a subset of the letters: (the_set & ~the_letters) = 0.
Where you do care about dups, sort the letters in the word and store that as the key. "msike" becomes "eikms".
Build a table that contains 3 columns:
eikms -- non unique index on this
msike -- the real word - probably good to have this as the PRIMARY KEY
SET('m','s','i','k','e') -- for the other situation.
msikei and meek would be entered as
eiikms
msikei
SET('m','s','i','k','e') -- (or, if more convenient: SET('m','i','s','i','k','e'))
eekm
meek
SET('e','k','m')
REGEXP is not practical for your task.
Edit 1
I think you also need a column that indicates whether there are any doubled letters in the word. That way, you can distinguish that kiss is allowed for msikes but not for msike.
Edit 2
A SET or an INT UNSIGNED can hold 1 bit for each of the 26 letters -- 0 for not present, 1 for present.
msikes and msike would both go into the set with exactly 5 bits turned on. The value to INSERT would be 'm,s,i,k,e,s' for msikes. Since the rest needs to involve Boolean arithmetic, maybe it would be better to use INT UNSIGNED. So...
a is 1 (1 << 0)
b is 2 (1 << 1)
c is 4 (1 << 2)
d is 8 (1 << 3)
...
z is (1 << 25)
To INSERT you use the | operator. bad becomes
(1 << 1) | (1 << 0) | (1 << 3)
Note how the bits are laid out, with 'a' at the bottom:
SELECT BIN((1 << 1) | (1 << 0) | (1 << 3)); ==> 1011
Similarly 'ad' is 1001. So, does 'ad' match 'bad'? The answer comes from
SELECT b'1001' & ~b'1011' = 0; ==> 1 (meaning 'true')
That means that all the letters in 'ad' (1001) are found in 'bad' (1011). Let's introduce "bed", which is 11010.
SELECT b'11010' & ~b'1011' = 0; ==> FALSE because of 'e' (10000)
But 'dad' (1001) will work fine:
SELECT b'1001' & ~b'1011' = 0; ==> TRUE
So, now comes the "dup" flag. Since 'dad' has dup letters, but 'bad' did not, your rules say that it is not a match. But it took the "dup" to finish the decision.
If you have not had a course in Boolean arithmetic, well, I have just presented the first couple of chapters. If I covered it too fast, find a math book on such and jump in. "It's not rocket science."
So, back to what code is needed to decide whether a stored word uses only a subset of the letters in my_word, and whether it is allowed to have duplicate letters:
SELECT tbl.mask & ~$my_mask = 0, dup FROM tbl;
Then combine the two results with the suitable AND / OR to finish the logic.
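The bit arithmetic from the examples above can be tried out in PHP before committing to a schema. This is a sketch only; the function names letterMask, hasDup and usesOnly are made up here. The mask layout is the one described above, with 'a' as the lowest bit.

```php
<?php
// One bit per letter: a = 1 << 0, b = 1 << 1, ..., z = 1 << 25.
function letterMask(string $word): int
{
    $mask = 0;
    foreach (str_split(strtolower($word)) as $ch) {
        $bit = ord($ch) - ord('a');
        if ($bit >= 0 && $bit < 26) {
            $mask |= 1 << $bit;
        }
    }
    return $mask;
}

// The "dup" flag: does any character occur more than once?
function hasDup(string $word): bool
{
    $counts = count_chars(strtolower($word), 1);
    return $counts !== [] && max($counts) > 1;
}

// True if every letter of $candidate also occurs in $letters.
function usesOnly(string $candidate, string $letters): bool
{
    return (letterMask($candidate) & ~letterMask($letters)) === 0;
}

printf("%b\n", letterMask('bad')); // 1011
printf("%b\n", letterMask('ad'));  // 1001
var_dump(usesOnly('ad', 'bad'));   // bool(true)
var_dump(usesOnly('bed', 'bad'));  // bool(false), because of 'e'
var_dump(usesOnly('dad', 'bad') && !hasDup('dad')); // bool(false): 'dad' repeats a letter
```

Note that the flag only says that some letter is repeated, not which one, so an exact check for inputs that themselves contain repeated letters still needs per-letter counts.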

With the limited regex support in MySQL, the best I can do is a PHP script that generates the query, presuming the input only includes English letters. It seems that writing an expression that excludes invalid words is easier than one that includes the valid ones.
<?php
$inputword = str_split('msikes');
$counter = array();
foreach (range('a', 'z') as $l) {
$counter[$l] = 0;
}
foreach ($inputword as $l) {
$counter[$l]++;
}
$nots = '';
foreach ($counter as $l => $c) {
if (!$c) {
$nots .= $l;
unset($counter[$l]);
}
}
$conditions = array();
if(!empty($nots)) {
// exclude words that have letters not given
$conditions[] = "[" . $nots . "]";
}
foreach ($counter as $l => $c) {
$letters = array();
for ($i = 0; $i <= $c; $i++) {
$letters[] = $l;
}
// exclude words that have the current letter more times than given
$conditions[] = implode('.*', $letters);
}
$sql = "SELECT word FROM words WHERE word NOT RLIKE '" . implode('|', $conditions) . "'";
echo $sql;

Something like this might work for you:
// Input Word
$WORD = strtolower('msikes');
// Alpha Array
$Alpha = range('a', 'z');
// Turn it into letters.
$Splited = str_split($WORD);
$Letters = array();
// Count occurrence of each letter, use letter as key to make it unique
foreach( $Splited as $Letter ) {
$Letters[$Letter] = array_key_exists($Letter, $Letters) ? $Letters[$Letter] + 1 : 1;
}
// Build a list of letters that shouldn't be present in the word
$ShouldNotExists = array_filter($Alpha, function ($Letter) use ($Letters) {
return ! array_key_exists($Letter, $Letters);
});
#### Building SQL Statement
// Letters to skip
$SkipLetters = array();
foreach( $ShouldNotExists as $SkipLetter ) {
$SkipLetters[] = "`count_{$SkipLetter}` = 0";
}
// count condition (for multiple occurrences)
$CountLetters = array();
foreach( $Letters as $K => $V ) {
$CountLetters[] = "`count_{$K}` <= {$V}";
}
$SQL = 'SELECT `word` FROM `words` WHERE '.PHP_EOL;
$SQL .= '('.implode(' AND ', $SkipLetters).')'.PHP_EOL;
$SQL .= ' AND ('.implode(' AND ', $CountLetters).')'.PHP_EOL;
$SQL .= ' ORDER BY LENGTH(`word`), `word`'.PHP_EOL;
echo $SQL;

Convert an unstructured string to a 2D array in PHP

I have parsed some data from a file into a string variable,
CPU NFS CIFS HTTP Total Net kB/s HDD kB/s SSD kB/s Tape kB/s Cache Cache CP CP HDD SSD OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write read write age hit time ty util util in out in out
65% 0 0 0 11357 97020 2846 0 156160 0 0 0 0 >60 100% 100% :f 45% 0% 3 0 11354 0 0 92987 0
67% 0 0 0 11761 100535 2943 511 161119 0 0 0 0 >60 100% 100% :f 43% 0% 0 0 11761 0 0 96397 0
66% 0 0 0 11911 101736 2984 276 151088 0 0 0 0 >60 100% 100% :v 48% 0% 0 0 11911 0 0 97534 0
56% 0 0 0 12026 102664 3094 36 24 0 0 0 0 >60 100% 1% : 2% 0% 11 0 12015 0 0 98419 0
81% 0 0 0 10023 85660 2317 1964 198416 0 0 0 0 >60 100% 83% Ff 60% 0% 0 0 10023 0 0 82117 0
67% 0 0 0 11914 101825 2993 336 152883 0 0 0 0 >60 100% 100% :f 55% 0% 0 0 11914 0 0 97625 0
67% 0 0 0 11526 98491 2869 256 151040 0 0 0 0 >60 100% 100% :f 51% 0% 0 0 11526 0 0 94388 0
66% 0 0 0 11589 99011 2931 0 143225 0 0 0 0 >60 100% 100% :f 51% 0% 0 0 11589 0 0 94949 0
57% 0 0 0 11869 101355 3032 56 20544 0 0 0 0 >60 100% 26% : 10% 0% 7 0 11862 0 0 97182 0
76% 0 0 0 9408 79189 2212 2022 122504 0 0 0 0 >60 100% 48% Fn 38% 0% 223 0 9185 0 0 75939 0
74% 0 0 0 10978 92981 2651 572 147078 0 0 0 0 >60 100% 100% :f 53% 0% 19 0 10959 0 0 89095 0
67% 0 0 0 11839 101109 2946 8 148332 0 0 0 0 >60 100% 100% :f 56% 0% 0 0 11839 0 0 96954 0
64% 0 0 0 11517 98413 2899 256 138248 0 0 0 0 >60 100% 100% :f 51% 0% 0 0 11517 0 0 94355 0
62% 0 0 0 11653 99151 2920 559 106198 0 0 0 0 >60 100% 81% : 40% 0% 52 0 11601 0 0 95030 0
56% 0 0 0 11765 99752 2973 577 3009 0 0 0 0 >60 100% 3% Fn 2% 0% 100 0 11665 0 0 95652 0
82% 0 0 0 9987 85219 2327 1570 207259 0 0 0 0 >60 100% 100% :f 60% 0% 5 0 9982 0 0 81692 0
67% 0 0 0 11859 101347 2970 0 158696 0 0 0 0 >60 100% 100% :f 57% 0% 0 0 11859
I want to convert it into a 2D array so that I can take the average of one of the columns.
Can someone let me know how this is done?
My code:
preg_match('/'.preg_quote($word1).'(.*?)'.preg_quote($word2).'/is', $akshay_file, $match);
$text2 = nl2br($match[1]);
echo "$text2";
Am I doing it the wrong way?
Thanks.

I guess you are reading this from a file. One option is to read the file line by line and apply the following code to each line:
<?php
$n = "65% 0 0 0 11357 97020 2846 0 156160 0 0 0 0 >60 100% 100% :f 45% 0% 3 0 11354 0 0 92987 0";
$n = preg_replace('!\s+!', ' ', $n);
$parts = explode(' ', $n);
echo '<pre>'.print_r($parts, true).'</pre>';
?>

First split the text into lines, then split each line on whitespace.
$lines = preg_split('/$\R?^/m', $data);
foreach ($lines as $line) {
$cols = preg_split('/\s+/', trim($line));
// do sth..
}
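Putting the two steps together, a rough sketch of building the 2D array and averaging one column could look like the following. The function names parseTable and columnAverage are made up for this example, and the sample $data holds only the first six columns of three data rows from the question, with the two header lines already removed.

```php
<?php
// Split raw text into rows, and each row into whitespace-separated columns.
function parseTable(string $data): array
{
    $table = [];
    foreach (preg_split('/\R/', trim($data)) as $line) {
        $line = trim($line);
        if ($line !== '') {
            $table[] = preg_split('/\s+/', $line);
        }
    }
    return $table;
}

// Average one column; the (float) cast drops a trailing "%" ("65%" becomes 65.0).
// Assumes the table is not empty and every row has the requested column.
function columnAverage(array $table, int $col): float
{
    $values = array_map(fn($row) => (float) $row[$col], $table);
    return array_sum($values) / count($values);
}

$data = "65% 0 0 0 11357 97020\n67% 0 0 0 11761 100535\n66% 0 0 0 11911 101736";
$table = parseTable($data);
echo columnAverage($table, 0), "\n"; // average of the CPU column
echo columnAverage($table, 4), "\n"; // average of the "Total" column
```

Values such as ">60" or ":f" are not numeric, so columns containing them would need their own handling before averaging.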

How can I monitor memory consumption during webserver stress test?

I want to do a stress test on an Apache webserver that I have running on localhost. The test will request the webserver to execute a PHP application that I wrote. I want to see how much memory (RAM) the webserver (and/or the associated PHP process) consumes during the test. Or to see how much it consumed after the test is done.
My OS is Ubuntu 13.10.
I looked at Apache Bench, Apache JMeter, Siege and httperf. None of them seem to provide such information. At most, I can see some CPU load in httperf (which in most cases is 100 %, so not too relevant).
Is there some tool that can provide me with memory consumption information? It doesn't have to be a webserver benchmarking tool; it could also be other Linux software that runs in parallel with the benchmarking tool. I just think that manually monitoring the test via the top command is kind of inaccurate/amateurish. Thank you in advance.

htop may be exactly what you're looking for.
Personally, I recently discovered something called byobu, which gives you a handy readout at the bottom of the terminal (you can configure it by pressing F9). It has become my personal favorite for exactly what you're describing.
You could also look into xdebug and use something like xdebug_memory_usage() in the PHP script you're testing to dump info into a log file at key points in your script.
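If Xdebug is not installed, the same idea of logging at key points can be sketched with PHP's built-in memory_get_usage() and memory_get_peak_usage() instead of xdebug_memory_usage(). This is a swapped-in alternative, not the Xdebug API; the function name logMemory and the log file location are assumptions made for this example.

```php
<?php
// Append one line with the current and peak memory usage to a log file.
function logMemory(string $label, string $logFile): string
{
    $line = sprintf(
        "%s %s current=%.2fMB peak=%.2fMB\n",
        date('c'),
        $label,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576
    );
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
    return $line;
}

// Hypothetical log location; use a path the webserver user can write to.
$logFile = sys_get_temp_dir() . '/memory_usage.log';

logMemory('start', $logFile);
$buffer = str_repeat('x', 5 * 1024 * 1024); // stand-in for the real work
logMemory('after-allocation', $logFile);
unset($buffer);
logMemory('end', $logFile);
```

These figures cover only the memory allocated by PHP for that one request, not the whole Apache process, so they complement rather than replace tools such as htop.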

I've set up a few PHP cron jobs too, and when I start a script manually from the console I want to see debug output as well.
I put in a method like this:
protected $consoleUpdate;
protected function printMemoryUsage() {
if ((time() - $this->consoleUpdate) >= 3) {
$this->consoleUpdate = time();
echo "Memory: ",
round(memory_get_usage(true) / (1024 * 1024)),
" MB",
"\r";
}
}
Call this method as often as you like to print the script's memory usage.
Notice the final \r in the output, which returns the cursor to the beginning of the line so that the next call overwrites it. If you don't have any other output, the screen does not scroll; the line simply gets updated in place.

Things like top, htop, memstat, iotop and mysqltop are all excellent for seeing what is thoroughly cooking your server while you throw siege (and its friend apachebench) at it.

I use vmstat for memory, disk and CPU monitoring. Below are some measurements taken whilst copying files on a bottom-of-the-heap, Linux-based Raspberry Pi. I first used vmstat in the 1980s, monitoring DB activity on early Unix systems. More details in:
http://www.roylongbottom.org.uk/Raspberry%20Pi%20Stress%20Tests.htm
vmstat was either run from a separate terminal or in a combined script file.
pi@raspberrypi /mnt $ time sudo sh -c "cp -r 256k /mnt/new2 && sync"
40 samples at 1 second intervals
vmstat 1 40 > vmstatRes.txt
real 0m38.781s
user 0m0.400s
sys 0m8.400s
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 304684 15208 65952 0 0 0 0 1109 319 39 2 59 0
1 0 0 282788 15208 87308 0 0 10292 0 4018 2994 5 57 9 29
1 1 0 256996 15208 112768 0 0 12380 0 4687 3863 3 53 0 44
2 2 0 231452 15212 138028 0 0 12288 40 4781 4024 5 55 0 40
0 1 0 216576 15216 152476 0 0 7004 10512 5649 3580 5 50 0 46
2 2 0 201688 15216 167288 0 0 7144 17488 5341 3527 2 52 0 46
1 0 0 195064 15216 173808 0 0 3192 9016 5909 3214 2 34 0 64
3 0 0 169520 15216 199152 0 0 12304 0 4704 3914 2 60 0 38
2 3 0 149988 15220 218288 0 0 9252 9892 5003 3614 2 52 0 45
0 2 0 131008 15224 237072 0 0 9112 10324 5086 3568 2 54 0 44
1 0 0 120160 15224 247784 0 0 5232 0 4976 2935 0 34 0 66
0 1 0 110424 15224 257404 0 0 4628 12864 5097 3034 4 36 0 60
1 0 0 86556 15224 281120 0 0 11536 0 4965 3874 3 54 0 43
1 1 0 73784 15224 293816 0 0 6188 11592 5545 3514 2 46 0 52
1 1 0 63252 15232 304132 0 0 4968 10320 4617 2748 2 34 0 64
0 1 0 43148 15232 323960 0 0 9652 7184 5126 3749 2 54 0 43
0 1 0 29336 15232 337560 0 0 6596 10036 4311 2796 2 38 0 59
1 1 0 23944 11696 346276 0 0 7480 0 5465 3455 2 46 0 52
2 1 0 23076 9580 349184 0 0 2860 10524 4521 2323 1 35 0 64
2 1 0 24440 5300 351508 0 0 8864 5188 4586 3215 1 66 0 33
0 1 0 24500 3900 352704 0 0 4896 11448 5974 3308 2 49 0 49
1 1 0 24432 3772 352700 0 0 10424 6208 4851 3682 2 60 0 38
1 1 0 23764 3772 353736 0 0 6568 5184 5970 3526 1 45 0 53
1 1 0 24068 3776 353500 0 0 4900 11388 5449 3142 0 40 0 60
0 1 0 24400 3780 352552 0 0 10068 8848 4821 3531 2 57 0 40
1 1 0 24152 3772 352588 0 0 8292 2784 5207 3588 2 50 0 48
1 1 0 23516 3772 353620 0 0 6800 7816 5475 3475 1 49 0 49
0 1 0 24260 3772 352940 0 0 7004 7424 5042 3284 4 43 0 52
2 1 0 24068 3776 353060 0 0 4624 10292 4798 2801 0 39 0 61
2 0 0 23820 3780 353340 0 0 8844 5508 5251 3609 0 56 0 44
2 1 0 24252 3772 352528 0 0 4552 12000 5053 2841 2 44 0 54
1 1 0 23696 3772 353120 0 0 10880 2176 4908 3694 2 58 0 40
1 0 0 24260 3772 352212 0 0 3748 11104 5208 2904 2 34 0 63
3 2 0 24136 3780 352084 0 0 10148 1628 4637 3568 1 55 0 44
0 1 0 24192 3780 352120 0 0 4016 10260 4719 2613 1 31 0 68
1 1 0 24392 3772 352076 0 0 6804 10972 5386 3473 1 52 0 47
1 1 0 24392 3772 351704 0 0 8568 8788 5101 3502 2 61 0 36
0 1 0 24376 3780 351764 0 0 0 30036 6711 1888 0 36 0 64
0 1 0 24252 3780 351928 0 0 28 2072 5629 1354 0 10 0 90
0 0 0 24768 3780 351968 0 0 40 20 1351 579 9 6 13 72
1 0 0 24768 3780 351968 0 0 0 0 1073 55 1 1 98 0
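To avoid reading such a table by eye, the saved output can be summarised with a few lines of PHP. This is only a sketch: the function name vmstatColumn is made up, and the sample below contains just the first rows of the output shown above. With the real file you would pass file_get_contents('vmstatRes.txt') instead.

```php
<?php
// Extract one named column from vmstat output as a list of integers.
function vmstatColumn(string $output, string $name): array
{
    $index = null;
    $values = [];
    foreach (preg_split('/\R/', trim($output)) as $line) {
        $fields = preg_split('/\s+/', trim($line));
        if ($index === null) {
            // The header row is the one that lists the column names.
            $pos = array_search($name, $fields, true);
            if ($pos !== false) {
                $index = $pos;
            }
            continue;
        }
        if (isset($fields[$index]) && ctype_digit($fields[$index])) {
            $values[] = (int) $fields[$index];
        }
    }
    return $values;
}

$sample = <<<TXT
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 304684 15208 65952 0 0 0 0 1109 319 39 2 59 0
1 0 0 282788 15208 87308 0 0 10292 0 4018 2994 5 57 9 29
1 1 0 256996 15208 112768 0 0 12380 0 4687 3863 3 53 0 44
TXT;

$free = vmstatColumn($sample, 'free');
printf("free memory: min %d, max %d\n", min($free), max($free));
```

The values are in whatever unit vmstat reported, so the script prints them without a unit.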

PHP preg_match strings in between quotations for entire file

I've created a php file that I'm trying to parse data with. The file's content that I'd like to parse looks like this:
[Titles]
hollywoodhd1 1 0 8046 0 919 PG-13 6712 1 identity_hd "(HD) Identity Thief" Disk 0 04/15/13 11/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
hollywoodhd2 3 0 8016 0 930 PG 5347 1 escapep_hd "(HD) Escape from Planet Earth" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
hollywoodhd3 1 0 8012 0 930 PG-13 5828 1 darkski_hd "(HD) Dark Skies" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
The PHP that I've created:
<?php
foreach (glob("*.mov") as $filename)
$theData = file_get_contents($filename) or die("Unable to retrieve file data");
// echo nl2br($theData); //- This will print the entire text-wrapped line breaks.
$Ratings = ['G', 'PG', 'PG-13', 'R', 'NR', 'XXX']; // - This doesn't do anything yet.
if (preg_match('!"([^"]+)"!', $theData, $m)){
echo $m[1];
}
?>
QUESTION: I'd like to return MOVIE TITLE : RATING, but the code that I have only returns a single movie title so far, (HD) Identity Thief. I have a long way to go, so any guidance would be greatly appreciated. Is there a way to sort the Movie Title : Rating pairs strictly by rating?
Also, is there a way to have the PHP script search a directory and its subfolders for any file with the ".mov" extension, and run the script for each file?

Here is what I have so far. I haven't done the sorting part, since you don't say how you want it sorted.
<?php
header("content-type: text/plain");
function getInfo($string){
    $Ratings = ['G', 'PG', 'PG-13', 'R', 'NR', 'XXX']; // Ratings to look for
    // Split the line around the quoted title: [before, title, after]
    $split = preg_split("/\"(.+)\"/", $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    // The rating is the first known rating followed by whitespace before the title
    preg_match("/(".implode("|", $Ratings).")\s/", $split[0], $matches);
    $rating = $matches[1]; // group 1 excludes the trailing whitespace
    return ["title" => $split[1], "rating" => $rating];
}
$string = <<<String
[Titles]
hollywoodhd1 1 0 8046 0 919 PG-13 6712 1 identity_hd "(HD) Identity Thief" Disk 0 04/15/13 11/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
hollywoodhd2 3 0 8016 0 930 PG 5347 1 escapep_hd "(HD) Escape from Planet Earth" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
hollywoodhd3 1 0 8012 0 930 PG-13 5828 1 darkski_hd "(HD) Dark Skies" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
String;
$titles = explode("\n", $string);
// Remove the first line
unset($titles[0]);
foreach($titles as $title){
$info = getInfo($title);
echo "{$info["title"]} : {$info["rating"]}\n";
}
Then, here is the output that is returned:
(HD) Identity Thief : PG-13
(HD) Escape from Planet Earth : PG
(HD) Dark Skies : PG-13
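The two remaining parts of the question, sorting by rating and searching subfolders for ".mov" files, could be sketched as follows. This is one possible approach, not a tested solution for the real files. The names RATING_ORDER, parseTitles and findMovFiles are made up, and sorting in the order of the $Ratings array is an assumption, since the question does not say which order is wanted.

```php
<?php
// Assumed sort order, taken from the $Ratings array in the question.
const RATING_ORDER = ['G', 'PG', 'PG-13', 'R', 'NR', 'XXX'];

// Return every title/rating pair found in $text, sorted by rating, then title.
function parseTitles(string $text): array
{
    // A rating surrounded by whitespace, then the quoted title on the same line.
    // PG-13 is listed before PG so the longer alternative is tried first.
    $pattern = '/\s(G|PG-13|PG|R|NR|XXX)\s[^"\n]*"([^"\n]+)"/';
    preg_match_all($pattern, $text, $found, PREG_SET_ORDER);

    $titles = [];
    foreach ($found as $m) {
        $titles[] = ['title' => $m[2], 'rating' => $m[1]];
    }

    $rank = array_flip(RATING_ORDER);
    usort($titles, fn($a, $b) =>
        [$rank[$a['rating']], $a['title']] <=> [$rank[$b['rating']], $b['title']]);
    return $titles;
}

// Yield the path of every .mov file below $dir, including subfolders.
function findMovFiles(string $dir): Generator
{
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'mov') {
            yield $file->getPathname();
        }
    }
}

$sample = <<<TXT
[Titles]
hollywoodhd1 1 0 8046 0 919 PG-13 6712 1 identity_hd "(HD) Identity Thief" Disk 0 04/15/13 11/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
hollywoodhd2 3 0 8016 0 930 PG 5347 1 escapep_hd "(HD) Escape from Planet Earth" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
hollywoodhd3 1 0 8012 0 930 PG-13 5828 1 darkski_hd "(HD) Dark Skies" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
TXT;

foreach (parseTitles($sample) as $t) {
    echo "{$t['title']} : {$t['rating']}\n";
}
```

To process real files, loop over findMovFiles('/path/to/folder') (a placeholder path) and pass file_get_contents($path) to parseTitles for each result.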
