I have a file that contains many lines such as these:
2011-03-23 10:11:08 34 57 2 25,5 -
2011-03-23 10:11:12 67 54 3 3,5 -
2011-03-23 10:11:16 76 57 3 2,4 -
2011-03-23 10:11:18 39 41 2 25,5 +
Each line ends with + or -. I'd like the file content to be split after the + or - sign. The lines don't all have the same number of characters.
I was trying to read the file using fgets() with auto_detect_line_endings on, but many lines were still combined into a single one.
Output example:
The output should be two lines, but there is only one (you can see the line break, but PHP doesn't):
2011-03-23 10:11:08 34 57 2 25,5 -
2011-03-23 10:11:12 67 54 3 3,5 -
EDIT:
Code I am using to read the file
ini_set('auto_detect_line_endings', true);
$handle = fopen($filename, "r");
$index = 1;
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        if (trim($line) != '') {
            $data = preg_split('/\s+/', trim($line));
            // Saving into DB...
            $index++;
        }
    }
    // Close only when fopen() succeeded
    fclose($handle);
}
To make sure you get all of the possible new line combinations you should use preg_split instead:
LF = \n, CR = \r
LF: Multics, Unix and Unix-like systems (GNU/Linux, OS X, FreeBSD, AIX, Xenix, etc.), BeOS, Amiga, RISC OS and others.
CR: Commodore 8-bit machines, Acorn BBC, ZX Spectrum, TRS-80, Apple II family, Mac OS up to version 9 and OS-9
LF+CR: Acorn BBC and RISC OS spooled text output.
CR+LF: Microsoft Windows, DEC TOPS-10, RT-11 and most other early non-Unix and non-IBM OSes, CP/M, MP/M, DOS (MS-DOS, PC DOS, etc.), Atari TOS, OS/2, Symbian OS, Palm OS, Amstrad CPC
The regex would be /(\r\n|\n\r|\n|\r)/ (CR+LF or LF+CR or LF or CR):
$lines = preg_split('/(\r\n|\n\r|\n|\r)/', $string);
If you don't want any empty lines in the result (treating whitespace-only lines as empty), you can append \s* to the regex; it matches zero or more whitespace characters after each newline:
$lines = preg_split('/(\r\n|\n\r|\n|\r)\s*/', $string);
If you don't want any empty lines, but whitespace-only lines should not count as empty, you can simplify the regex even further:
$lines = preg_split('/[\n\r]+/', $string);
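As a minimal sketch of how the split could be combined with the per-line tokenising from the question: the sample string below is made up (it mixes CR, CR+LF and LF on purpose), and PREG_SPLIT_NO_EMPTY is used here as an alternative way to drop empty pieces.

```php
<?php
// Hypothetical sample: four records separated by CR, CR+LF and LF respectively.
$string = "2011-03-23 10:11:08 34 57 2 25,5 -\r"
        . "2011-03-23 10:11:12 67 54 3 3,5 -\r\n"
        . "2011-03-23 10:11:16 76 57 3 2,4 -\n"
        . "2011-03-23 10:11:18 39 41 2 25,5 +";

// Split on CR+LF, LF+CR, LF or CR; PREG_SPLIT_NO_EMPTY drops empty pieces.
$lines = preg_split('/\r\n|\n\r|\n|\r/', $string, -1, PREG_SPLIT_NO_EMPTY);

$rows = [];
foreach ($lines as $line) {
    // Same whitespace tokenising as in the question's loop.
    $rows[] = preg_split('/\s+/', trim($line));
}

print_r($rows[1]);
```

This reads the whole content into memory first (for example with file_get_contents), which is fine for small files but may not suit very large ones.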
TRY THIS:
<?php
$input = "2011-03-23 10:11:08 34 57 2 25,5 -
2011-03-23 10:11:12 67 54 3 3,5 -
2011-03-23 10:11:16 76 57 3 2,4 -
2011-03-23 10:11:18 39 41 2 25,5 +";
// 1st explode by new line
$output = explode("\n", $input);
print_r($output);
// 2nd remove last character
$result = array();
foreach ($output as $op) {
    $result[] = substr($op, 0, -1);
}
print_r($result);
OUTPUT:
Array
(
[0] => 2011-03-23 10:11:08 34 57 2 25,5 -
[1] => 2011-03-23 10:11:12 67 54 3 3,5 -
[2] => 2011-03-23 10:11:16 76 57 3 2,4 -
[3] => 2011-03-23 10:11:18 39 41 2 25,5 +
)
Array
(
[0] => 2011-03-23 10:11:08 34 57 2 25,5
[1] => 2011-03-23 10:11:12 67 54 3 3,5
[2] => 2011-03-23 10:11:16 76 57 3 2,4
[3] => 2011-03-23 10:11:18 39 41 2 25,5
)
DEMO:
http://3v4l.org/0uIe7#v430
I'm doing social service, and the idea is to automate a process. I already capture other values using regex, like this:
<?php
function LeerEncabezado() {
    $fh = fopen('RPREGFM_________007_001.txt', 'r') or die('lel');
    // Strip all commas from the file and write it back
    $file = str_replace(',', '', file_get_contents("RPREGFM_________007_001.txt"));
    $f = fopen("RPREGFM_________007_001.txt", "w");
    fwrite($f, $file);
    fclose($f);
    while (!feof($fh)) {
        $line = fgets($fh);
        // Intermediate
        if (preg_match('/INTERMEDIO (?<cfintermedio>[\w]+.+)/i', $line, $r1)) {
            $CFINTERMEDIO = substr($r1['cfintermedio'], 13, 8);
            echo "C.F. Intermedio: $CFINTERMEDIO";
            echo '<br/>';
        }
        // Base
        if (preg_match('/BASE (?<cfbase>[\w]+.+)/i', $line, $r2)) {
            $CFBASE = substr($r2['cfbase'], 13, 8);
            echo "C.F. Base: $CFBASE";
            echo '<br/>';
        }
        // kVArh
        if (preg_match('/F.P. (?<fp>[\d]+.+)/i', $line, $r3)) {
            $anioi = substr($r3['fp'], 0, 5);
            echo "kVArh: $anioi";
            echo '<br/>';
            echo "--------------------------------------<br/>";
        }
        // Base average
        if (preg_match('/201912 (?<fp>[\d]+.+)/i', $line, $r3)) {
            $anioi = substr($r3['fp'], 8, 5); // echo "kVArh: $anioi";
            echo '<br/>';
            echo "--------------------------------------<br/>";
        }
    }
    fclose($fh); // $fh is only in scope inside the function
}
LeerEncabezado();
It works up to this point. Now I need to take 3 values from the text block below: the last number in the INTERM column, BASE, and % M$. In this receipt, they would be 1, 5 and 99.98%. I've used str_replace before, so I guess I'll use it for the %. That way I can compare those variables with others in the same receipt.
How can I extract those values, keeping in mind that 201912 is going to change with the year and that the number of rows under MES might change? Could I use the blank line after the table and read the line before it, which is the one that holds the 3 values?
MES TOTAL PUNTA INTERM BASE TOT PTA INT BAS % M$
201901 9 1 7 744 122 622 99.96
201902 8 1 6 672 107 565 99.97
201903 9 1 7 744 115 629 99.97
201904 9 2 7 719 122 597 99.97
201905 10 1 8 744 88 656 99.98
201906 10 1 8 720 80 640 99.97
201907 12 2 10 744 92 652 98.89
201908 13 2 11 744 88 656 97.74
201909 11 1 9 720 80 640 97.05
201910t 1 7 624 76 548 97.56
201910 10 1 120 20 100 98.80
201911 8 1 6 721 115 606 99.20
201912 7 1 5 744 117 627 99.98
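One possible way to do this, sketched under several assumptions: data rows are recognised by a leading six-digit YYYYMM value, the table ends at the first blank line after those rows, and the wanted values sit at field positions 2, 3 and last of the final row (which is what "1, 5, 99.98" corresponds to in the sample). The $text excerpt and the variable names are illustrative only.

```php
<?php
// Hypothetical excerpt of the receipt; only the last rows are reproduced.
$text = "MES TOTAL PUNTA INTERM BASE TOT PTA INT BAS % M\$\n"
      . "201910 10 1 120 20 100 98.80\n"
      . "201911 8 1 6 721 115 606 99.20\n"
      . "201912 7 1 5 744 117 627 99.98\n"
      . "\n"
      . "some other section\n";

$last = null;
foreach (preg_split('/\R/', $text) as $line) {
    if (preg_match('/^\s*\d{6}/', $line)) {
        // A data row: remember it, so $last always holds the most recent one.
        $last = preg_split('/\s+/', trim($line));
    } elseif ($last !== null && trim($line) === '') {
        break; // first blank line after the rows: the table is over
    }
}

if ($last !== null) {
    $first  = $last[2];                 // assumed position of the first value
    $second = $last[3];                 // assumed position of the second value
    $pct    = $last[count($last) - 1];  // last field of the row
    echo "$first, $second, $pct%\n";
}
```

Because nothing is keyed on 201912, the same loop keeps working when the year or the number of rows changes, as long as the assumptions above hold for the real file.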
I want to parse a mobile number without special character for example
+61-426 861 479 ====> 61 426 861 479
PHP preg_match_all
preg_match_all('/(\d{2}) (\d{3}) (\d{3}) (\d{3})/', $part,$matches);
if (count($matches[0])) {
    foreach ($matches[0] as $mob) {
        $records['mobile'][] = $mob;
    }
}
Expected Output
+61-426 861 479 ====> 61 426 861 479
You are missing the + and the - in your pattern. You might update your pattern to use 2 capturing groups and use preg_match_all. To add the mobile number to the array you could concatenate the first and the second index.
\+(\d{2})-(\d{3}(?: \d{3}){2})\b
For example
$part = "+61-426 861 478 +61-426 861 479 ";
preg_match_all('/\+(\d{2})-(\d{3}(?: \d{3}){2})\b/', $part, $matches, PREG_SET_ORDER, 0);
if (count($matches)) {
    foreach ($matches as $mob) {
        $records['mobile'][] = $mob[1] . ' ' . $mob[2];
    }
}
print_r($records);
Result
Array
(
[mobile] => Array
(
[0] => 61 426 861 478
[1] => 61 426 861 479
)
)
If the number is the only content of the string, you might also replace all runs of non-digits (\D+) with a space, then use ltrim to remove the leading space left by the +.
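A short sketch of that replace-and-trim variant, assuming $part holds nothing but the one number:

```php
<?php
$part = '+61-426 861 479';

// Collapse every run of non-digits into a single space,
// then drop the leading space that the "+" turned into.
$mobile = ltrim(preg_replace('/\D+/', ' ', $part));

echo $mobile, "\n"; // 61 426 861 479
```

Unlike the preg_match_all pattern, this does no validation, so it only makes sense when the input is already known to be a phone number.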
I have a string and I want to match a specific pattern optionally as many times as may occur.
My String
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL
After 45 and before $595 there could be up to 6 more numbers. How can I optionally match repeating numbers in that space?
Here's what I have so far:
/([\d.]+) ([\d.]+) ([\d.]+)? (\d+) (\d+) (\d+) \$(\d+)/ig
Here are some samples with expected outputs:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL
output: array([0] => 0.91,
[1] => 0.45,
[2] => 0.69,
[3] => 58,
[4] => 47,
[5] => 45,
[6] => 23,
[7] => 83,
[8] => 90,
[9] => 595)
0.91 0.45 0.69 58 47 45 $595 NO IDL
output: array([0] => 0.91,
[1] => 0.45,
[2] => 0.69,
[3] => 58,
[4] => 47,
[5] => 45,
[6] => 595)
0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL
output: Does not match the pattern because we only want 3 of the first items to contain decimals.
This seems to split the last number into multiple numbers. I can't figure out what's going on.
I am using PHP's preg_match for this, so I would like no empty elements in the resulting array if possible. Thanks.
You may validate the string with a positive lookahead triggered at the start of the string, and then match all numbers from the start up to the currency value once the validation succeeds:
'~(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d))\s*\$?\K\d+(?:\.\d+)?~'
Details
(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d)) - either the end of the previous match (\G(?!^)) or the start of the string (^) that is followed with:
    \d+\.\d+ - 1+ digits, a dot, 1+ digits
    (space) - a space
    \d+\.\d+ - 1+ digits, a dot, 1+ digits
    (space) - a space
    \d+ - 1+ digits
    (?:\.\d+)? - an optional fractional part
    (?: \d+)* - 0+ sequences of a space followed with 1+ digits
    (space) - a space
    \$\d - a $ and a digit.
\s* - 0+ whitespaces
\$? - an optional $ char
\K - match reset operator
\d+(?:\.\d+)? - an int/float number (1+ digits followed with an optional sequence of . and 1+ digits).
PHP demo:
$strs = ['0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL','0.91 0.45 0.69 58 47 45 $595 NO IDL','0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL'];
$rx = '~(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d))\s*\$?\K\d+(?:\.\d+)?~';
foreach ($strs as $s) {
    echo "$s:\n";
    if (preg_match_all($rx, $s, $matches)) {
        print_r($matches[0]);
        echo "---------\n";
    } else {
        echo "NO MATCH!!!\n---------\n";
    }
}
Output:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL:
Array
(
[0] => 0.91
[1] => 0.45
[2] => 0.69
[3] => 58
[4] => 47
[5] => 45
[6] => 23
[7] => 83
[8] => 90
[9] => 595
)
---------
0.91 0.45 0.69 58 47 45 $595 NO IDL:
Array
(
[0] => 0.91
[1] => 0.45
[2] => 0.69
[3] => 58
[4] => 47
[5] => 45
[6] => 595
)
---------
0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL:
NO MATCH!!!
---------
This should give you the expected results:
/([\d\$.]+)/ig
You might match the leading numbers up to 45, which is the 6th number, and then use \G to get the numbers that follow as consecutive matches:
(?:\d+\.\d+)(?: \d+\.\d+){2}(?: \d+){3}\s*|\G(?!^)(\d+)\s
Explanation
(?:\d+\.\d+)(?: \d+\.\d+){2} Match the 3 numbers with a decimal part at the start
(?: \d+){3} Match a space followed by 1+ digits, 3 times. That will match up to 45
\s* Match zero or more whitespace characters
| Or
\G(?!^) Assert the position at the end of the previous match; the negative lookahead rules out the start of the string
(\d+)\s Capture 1+ digits in group 1 and match a whitespace character
For example a demo to extract the 3 digits after 45:
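A minimal sketch of such a demo, using the first sample string from the question. Group 1 is empty for the first (prefix) match, so empty strings are filtered out; the variable names are illustrative.

```php
<?php
$s  = '0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL';
$rx = '/(?:\d+\.\d+)(?: \d+\.\d+){2}(?: \d+){3}\s*|\G(?!^)(\d+)\s/';

preg_match_all($rx, $s, $matches);

// $matches[1] holds group 1 per match; the prefix match leaves it empty.
$extra = array_values(array_filter($matches[1], 'strlen'));

print_r($extra);
```

Note that the pattern is not anchored, so on its own it does not reject the third sample (four decimals at the start); adding ^ to the first alternative would be one way to tighten it.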
I have looked for a year to try to figure this one out. I am trying to build a bracket running system, for running bowling brackets.
I have a table with an ID column and a BowlerID column; call it bowling_bracket_entries. The ID is unique, but the same BowlerID can appear in multiple entries, anywhere from 1 to 8. What I want to do is make pairs from the BowlerID column without ever repeating the same pair, and then put those pairings into groups of 4 pairs where no BowlerID repeats within a group.
Structure of the bowling.bracket_entries table
ID | BowlerID
766 151
767 230
768 201
769 202
770 140
771 205
772 62
773 75
774 56
775 140
759 129
760 60
761 165
762 223
763 145
764 131
765 145
704 197
705 230
706 202
707 167
708 223
709 205
710 217
711 217
712 56
713 60
714 141
715 60
716 193
717 181
718 217
719 75
720 218
721 151
722 223
723 202
724 197
725 140
726 220
727 203
728 56
729 62
730 218
731 160
732 205
733 141
734 167
735 165
736 151
737 205
738 224
739 203
740 142
741 181
742 60
743 60
744 218
745 217
746 224
747 160
748 218
749 223
750 203
751 193
752 202
753 62
754 60
755 142
756 201
757 151
758 203
I tried randomly selecting 2 BowlerIDs and joining them with a delimiter (e.g. 22~100), then inserting the result into a Pairings table. I then pull the next pairing (e.g. 36~92), build the reverse of that pair (e.g. 92~36), and check the Pairings table for a value matching either one. If none is found, it inserts the pairing, removes the IDs of those BowlerIDs from the Entries table, and repeats until it runs out of values. The problem is that sometimes a BowlerID gets paired with itself; only occasionally do I get a complete list with no self-pairings.
SELECT bracket_entries.ID, bracket_entries.BowlerID FROM bracket_entries ORDER BY rand() LIMIT 2
Then put them together and create a pairing (ie 36~68)
$i = 0;
while ($pairing = $rsNewPair->fetch_assoc()) {
    // Build pairing list
    $thisPairing .= $pairing['BowlerID'];
    $IDS .= $pairing['ID'];
    $i++;
    if ($i < 2) {
        $thisPairing .= "~";
        $IDS .= "~";
    }
}
$flipFlop = explode('~', $thisPairing);
$reversePairing = $flipFlop[1].'~'.$flipFlop[0];
if ($flipFlop[0] == $flipFlop[1]) {
    header("Refresh:0");
}
And compare to what is already in there.
SELECT bracket_pairings.Pairing FROM bracket_pairings WHERE bracket_pairings.Pairing = '".$thisPairing."' OR bracket_pairings.Pairing = '".$reversePairing."'"
If it doesn't find anything, then insert the pairing into the Pairings table and move on to the next 2
bowling_bracket_pairings table structure
1 203~218
2 193~218
3 217~129
4 201~60
5 60~141
6 141~165
7 197~202
8 230~203
9 220~167
10 60~62
11 151~140
12 151~230
13 193~205
14 60~140
15 217~223
16 203~142
17 60~205
18 197~151
19 205~201
20 218~62
21 56~223
22 217~167
23 56~202
24 217~75
25 224~223
26 160~203
27 151~60
28 131~145
29 140~205
30 202~75
31 62~160
32 142~181
33 224~181
34 145~223
35 165~56
36 218~202
SELECT
PairingID, SUBSTRING_INDEX(Pairing, '~', 1) AS entry1,
SUBSTRING_INDEX(SUBSTRING_INDEX(Pairing, '~', 2), '~', -1) AS entry2
FROM bracket_pairings
Then I use a while loop to display the pairings in brackets, pushing each entry into an array for the 4 pairs until it is full, and compare to make sure no bowler is duplicated.
while(($pairings=$rsEntries->fetch_assoc())&&($loop < 5)){
    $thisBowlerID1 = $pairings['entry1'];
    $thisBowlerID2 = $pairings['entry2'];
    if((!in_array($thisBowlerID1, $thisBracket)) || (!in_array($thisBowlerID2, $thisBracket))){
        while($players=$rsPlayers->fetch_assoc()){
            if($players['BowlerID'] == $thisBowlerID1){
                echo $players['BowlerID'].'<br>';
                //echo $players['Name'].'('.$players['CurrentAvg'].')<br>';
            }
        }
        mysqli_data_seek($rsPlayers, 0);
        array_push($thisBracket, $thisBowlerID1);
        while($players=$rsPlayers->fetch_assoc()){
            if($players['BowlerID'] == $thisBowlerID2){
                echo $players['BowlerID'].'<br><br>';
                //echo $players['Name'].'('.$players['CurrentAvg'].')<br><br>';
            }
        }
        mysqli_data_seek($rsPlayers, 0);
        array_push($thisBracket, $thisBowlerID2);
        $removeSQL="DELETE FROM bracket_pairings WHERE bracket_pairings.PairingID = ".$pairings['PairingID'];
        $removePairing = $connAdmin->query($removeSQL);
        $loop++;
    }
    $thisBracket = array();
}
}
I have 72 entries. When I try to put them in groups of 4 pairings (8 entries), it never seems to fill all 9 brackets, only about 7.5, and it leaves a random assortment of pairings in the table that didn't get placed, even though I still have openings.
Result
Bracket 1
62
141
142
151
131
218
140
56
Bracket 2
145
201
193
160
56
205
129
203
Bracket 3
167
75
217
201
224
217
230
140
Bracket 4
60
193
203
197
141
167
223
220
Bracket 5
60
165
202
142
181
60
202
202
Bracket 6
205
140
62
218
217
60
230
223
Bracket 7
165
223
205
218
205
75
56
151
Bracket 8
202
203
As you can see, the result leaves bracket 8 unfilled.
Here is what is left over that didn't get included:
5 197~218
10 60~223
15 181~62
20 203~60
25 160~217
30 151~151
35 145~224
I'm not sure why every fifth one is skipped. I think I am on the right track, but any help or ideas on how to fix these issues would be great.
Okay, first: Don't save the pairings as a string like "203~60". It makes it harder to work with the database when you have to combine/split the values all the time. Your tables should be in 3NF.
Second: Don't save the pairings in the database while you are still building them. Keep them in memory in PHP; that avoids unnecessary database calls just to see whether a pairing has already been added, and it is much faster.
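To illustrate the in-memory idea, here is a rough sketch. The function names pairKey and tryPairing and the sample IDs are made up for this example, and the greedy strategy is just one simple option: it can hit a dead end, in which case it returns null and the caller may shuffle the entries and retry.

```php
<?php
// Order-independent key, so 36~92 and 92~36 count as the same pair.
function pairKey(int $a, int $b): string
{
    return min($a, $b) . '-' . max($a, $b);
}

// Greedy attempt: returns a list of [id, id] pairs, or null if it gets stuck.
function tryPairing(array $entries): ?array
{
    $pairs = [];
    $seen  = []; // set of pair keys already used
    while (count($entries) >= 2) {
        $first   = array_shift($entries);
        $partner = null;
        foreach ($entries as $i => $candidate) {
            // Never pair a bowler with himself, never repeat a pair.
            if ($candidate !== $first && !isset($seen[pairKey($first, $candidate)])) {
                $partner = $i;
                break;
            }
        }
        if ($partner === null) {
            return null; // dead end: caller can reshuffle and retry
        }
        $seen[pairKey($first, $entries[$partner])] = true;
        $pairs[] = [$first, $entries[$partner]];
        array_splice($entries, $partner, 1);
    }
    return $pairs;
}

$entries = [151, 230, 201, 151, 230, 201]; // hypothetical BowlerID list
$pairs   = tryPairing($entries);
print_r($pairs);
```

For a random draw, call shuffle($entries) before each attempt and cap the number of retries, since some inputs have no valid pairing at all. Only the finished list needs to be written to the database.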
That being said, there are some algorithms you can look up for your problem. You should check the following links:
Is there a known algorithm for scheduling tournament matchups? and its related questions on softwareengineering.stackexchange.com (this might even be a better place to ask, but check for duplicates)
https://en.wikipedia.org/wiki/Matching_%28graph_theory%29
https://en.wikipedia.org/wiki/Round-robin_tournament#Scheduling_algorithm
https://en.wikipedia.org/wiki/Backtracking
I can think of an algorithm, but it fails in some situations. It would work like this:
You use the algorithm on https://en.wikipedia.org/wiki/Round-robin_tournament#Scheduling_algorithm to create a schedule for 9 teams with 8 members each. Let's assume they are called "a" to "i". The pairings will look like this:
abcd aibc ahib aghi afgh
hgfe gfed fedc edcb dcbi
aefg adef acde bcfg
cbih bihg ihgf dehi
You get this seeding by holding the "a" team in place and rotating the remaining teams around the table/pairings. However, you have to skip one team per round, since you have 9 teams for 4*2 possible seeds. In the ninth group the "a" team is missing, and it contains the remaining pairings of "b" to "i".
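The rotation can be sketched as follows. This only reproduces the first eight groups shown above (the ninth, without "a", is not produced by the rotation and has to be added separately); the variable names are illustrative.

```php
<?php
$fixed  = 'a';
$others = ['b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
$rounds = [];

for ($r = 0; $r < 8; $r++) {
    // Top row: the fixed team plus the next three; bottom row: the following
    // four, reversed so that each column is one pairing. The last team sits out.
    $top    = array_merge([$fixed], array_slice($others, 0, 3));
    $bottom = array_reverse(array_slice($others, 3, 4));
    $rounds[] = implode('', $top) . '/' . implode('', $bottom);

    // Rotate: the last team moves to the front for the next round.
    $lastTeam = array_pop($others);
    array_unshift($others, $lastTeam);
}

print_r($rounds);
```

Reading each entry column by column gives the pairings, e.g. "abcd/hgfe" means a-h, b-g, c-f and d-e.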
When we have these 9 teams with 8 members each they could be represented as this:
aaaaaaaa
bbbbbbbb
cccccccc
dddddddd
eeeeeeee
ffffffff
gggggggg
hhhhhhhh
iiiiiiii
When you have more than 9 teams, you should try to group them so that they belong together in one pseudo-team of size 8. This can look like this:
aaaaaabb
ccccddde
fffffggg
hhiijjjj
kkkkklll
mmmmnnnn
ooopppqq
rrrrrsss
ttuuvvww
Since these teams would be on "the same" pseudo team, they don't match against each other and the algorithm still works.
However, the algorithm fails when you cannot put the teams into pseudo-teams of size 8. Assume you have 2 teams of 8 members and 8 teams of size 7. The pseudo-teams would look like this:
aaaaaaaa
bbbbbbbb
cccccccj
dddddddj
eeeeeeej
fffffffj
gggggggj
hhhhhhhj
iiiiiiij
In this situation, the "8th" player of row "c" might eventually play against the "8th" player of row "d", even though they are actually on the same team. You might try to be clever and move the "8th" player of row "c" to a different place in that row. But once you are on this road of fixing, you might as well use a backtracking algorithm instead.
With backtracking you brute-force all the combinations and abandon a combination as soon as you find that it cannot lead to a solution. Check the URL above to understand backtracking (the animated GIF might be helpful).
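As a rough illustration of backtracking applied to the bracket step, the sketch below splits a list of pairs into brackets of a given size so that no BowlerID appears twice in one bracket. The function name fillBrackets and the sample data are invented for this example, and it is plain brute force, so it can become slow on unlucky inputs.

```php
<?php
// Returns the list of brackets, or null if no valid arrangement exists.
function fillBrackets(array $pairs, int $size, array $brackets = [[]]): ?array
{
    if ($pairs === []) {
        // Success only if the last bracket is complete.
        return count(end($brackets)) === $size ? $brackets : null;
    }
    $last = count($brackets) - 1;
    if (count($brackets[$last]) === $size) {
        $brackets[] = []; // current bracket is full: open a new one
        $last++;
    }
    // BowlerIDs already placed in the current bracket.
    $used = array_merge([], ...$brackets[$last]);
    foreach ($pairs as $i => $pair) {
        if (in_array($pair[0], $used, true) || in_array($pair[1], $used, true)) {
            continue; // would repeat a bowler: skip this candidate
        }
        $next = $brackets;
        $next[$last][] = $pair;
        $rest = $pairs;
        unset($rest[$i]);
        $result = fillBrackets(array_values($rest), $size, $next);
        if ($result !== null) {
            return $result;
        }
        // Otherwise backtrack and try the next candidate pair.
    }
    return null;
}

// Hypothetical pairs; brackets of 2 pairs keep the example small.
$pairs  = [[1, 2], [1, 3], [2, 3], [4, 5], [4, 6], [5, 6]];
$result = fillBrackets($pairs, 2);
print_r($result);
```

For the real data the call would use a size of 4, and the pairs would come from the in-memory pairing step rather than from the database.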
In summary I am using stream_get_line to read a line of a file, replace a string and then write the line to another file.
I am using stream_get_line and supplying the "ending" parameter to instruct the function to read lines, or if there is no new line then read 130 bytes.
What I would like to know is how can I know if the 3rd parameter (PHP_EOL) was found, as I need to write exactly the same line (except for my string replacement) to the new file.
For reference...
string stream_get_line ( resource $handle , int $length [, string $ending ] )
It's mainly needed for the last line, sometimes it will contain a newline character and sometimes it doesn't.
My initial idea is to seek to the last line of the file and search the line for a new line character to see if I need to attach a newline to my edited line or not.
You could try using fgets if the stream is in ASCII mode (which only matters on Windows). That function will include the newline if it is found:
$line = fgets(STDIN, 131);
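A small sketch of how that could look for the copy-and-replace task. The in-memory streams and the 'foo'/'bar' strings are stand-ins for the real files and the real replacement.

```php
<?php
// In-memory streams stand in for the real input and output files (assumption).
$in  = fopen('php://memory', 'w+');
$out = fopen('php://memory', 'w+');
fwrite($in, "foo one\nfoo two\nlast foo"); // no newline after the last line
rewind($in);

// fgets() returns at most length - 1 bytes and keeps the line ending if it
// read one, so writing the result back reproduces the original line structure.
while (($line = fgets($in, 131)) !== false) {
    fwrite($out, str_replace('foo', 'bar', $line));
}

rewind($out);
$result = stream_get_contents($out);
var_dump($result === "bar one\nbar two\nlast bar"); // bool(true)
```

One caveat: when a line is longer than 130 bytes it is read in several chunks, so a search string that straddles a chunk boundary would not be replaced.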
Otherwise, you could use ftell to see how many bytes were read and thus determine whether there was a line ending. For example, if foo.php contains
<?php
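while (!feof(STDIN)) {
    $pos = ftell(STDIN); // position before the read
    $line = stream_get_line(STDIN, 74, "\n");
    // If more bytes were consumed than returned, the delimiter was found and discarded
    $ended = (bool)(ftell(STDIN) - strlen($line) - $pos);
    echo ($ended ? "YES " : "NO ") . $line . "\n";
}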
executing echo -ne {1..100} '\n2nd to last line\nlast line' | php foo.php will give this output:
NO 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
NO 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 5
NO 3 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77
YES 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
YES 2nd to last line
NO last line