In PHP, I am importing some text files containing tables of float values that are space delimited. All values contain two decimal places. A typical line would look like this:
1.45 22.87 99.12 19.55
However, on some lines, when the number before the decimal point is 3 digits long, the original file sometimes omits the space before it. So what should be:
1.45 122.87 99.12 19.55
comes in as:
1.45122.87 99.12 19.55
What I assume I need to do is search the string for decimal points, then look 2 characters after each one, and if there is not a space there, add one. I just cannot for the life of me figure out the most direct way to do so.
I would use regex:
$pattern = "/(-)?\d{1,}\.\d{2}/";
preg_match_all($pattern, "1.45122.87 99.12 19.55", $matches);
print_r($matches);
This does what you want. Probably not the most efficient way to do it though.
<?php
$line = "1.45122.87 99.12 19.55";
$length = strlen($line);
$result = '';
$i=0;
while ($i<$length)
{
if ($line[$i] == '.')
{
$result .= $line[$i];
$result .= $line[$i+1];
$result .= $line[$i+2];
$result .= ' ';
$i += 3;
}
else if ($line[$i] == ' ')
{
$i++;
}
else
{
$result .= $line[$i];
$i++;
}
}
echo $result;
?>
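The same idea (look two characters past each decimal point and add a space if the next number starts right there) can also be sketched with a single preg_replace() call. The function name fixMissingSpaces is hypothetical, and the sketch assumes every value has exactly two decimals, as the question states.

```php
<?php
// Hypothetical helper: re-insert the space that is missing after a
// two-decimal value when the next number starts immediately after it.
function fixMissingSpaces(string $line): string
{
    // (\.\d{2}) captures the decimal point and its two decimals;
    // (?=[-\d]) only matches when a digit or minus sign follows directly.
    return preg_replace('/(\.\d{2})(?=[-\d])/', '$1 ', $line);
}

echo fixMissingSpaces('1.45122.87 99.12 19.55'), "\n";
// prints: 1.45 122.87 99.12 19.55
```

Lines that already have all their spaces pass through unchanged, so it should be safe to run on every line of the file.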
This is a fixed-column-width file. I would parse these by substr().
http://php.net/manual/en/function.substr.php
for ($x=0; $x<strlen($line); $x+=4) {
$parts[] = trim(substr($line, $x, 4));
}
This will get you an array in $parts of all the fields. This is untested and assumes every field is exactly 4 characters wide; change the 4 to the real column width of your file.
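As a rough sketch of the fixed-width idea: the version below assumes (this is a guess, not something the question confirms) that each value is right-aligned in a 6-character column, so "1.45" is stored as "  1.45" and "122.87" fills its column completely. The function name parseFixedWidth and the default width are hypothetical.

```php
<?php
// Assumption: every field occupies exactly $width characters and is
// padded with spaces, e.g. "  1.45122.87 99.12 19.55" for width 6.
function parseFixedWidth(string $line, int $width = 6): array
{
    $fields = [];
    foreach (str_split($line, $width) as $chunk) {
        $chunk = trim($chunk);
        if ($chunk !== '') {           // skip blank padding chunks
            $fields[] = (float) $chunk;
        }
    }
    return $fields;
}

var_dump(parseFixedWidth('  1.45122.87 99.12 19.55'));
```

If the leading padding has been stripped from the line before it reaches this code, the columns no longer line up and a regex-based approach is the safer choice.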
$line = '1.45122.87 99.12 19.55';
preg_match_all('~([0-9]{1,3}\.[0-9]{2})~', $line, $matches);
var_dump($matches[1]);
/*
Result:
array(4) {
[0]=>
string(4) "1.45"
[1]=>
string(6) "122.87"
[2]=>
string(5) "99.12"
[3]=>
string(5) "19.55"
}
*/
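Building on the preg_match_all() approach, here is a minimal sketch that returns the values as floats instead of strings. The function name parseFloats is hypothetical, and the optional leading minus sign is an assumption, since the sample data only shows positive values.

```php
<?php
// Hypothetical helper: pull every "up to three digits, point, two digits"
// value out of a line and convert it to a float.
function parseFloats(string $line): array
{
    // -? allows an optional minus sign (an assumption about the data).
    preg_match_all('/-?\d{1,3}\.\d{2}/', $line, $matches);
    return array_map('floatval', $matches[0]);
}

var_dump(parseFloats('1.45122.87 99.12 19.55'));
```

Like the pattern it is based on, this only handles integer parts of one to three digits; a value such as 1234.56 would need a different rule for where one number ends and the next begins.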
You could use preg_split() to create an array from the line using a regex:
$lineArray = preg_split('/(?<=\.\d{2})\s*/', $lineOfNumbers, -1, PREG_SPLIT_NO_EMPTY);
This splits after every value of the form ####.## and does not depend on the spaces being present.
I would do something like this, say the line of decimals is in a variable called $line:
$parts = explode(' ', $line);
Now you have an array of decimal values (provided every value is separated by a space), so
$parts[0] = "1.45"
(float)$parts[0] = 1.45
$parts[1] = "122.87"
(float)$parts[1] = 122.87
// etc...
I'm working in PHP with a string containing parameters separated by some special characters, and I'm trying to parse it with preg_match.
An example could be like this one, which has four parameters.
1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?
Each parameter name is followed by ?#?, and its value is right next to it, ending with ?#? (note: values can be strings or numbers, and even special characters)
I've probably overcomplicated my regex, which works in SOME cases, but not if I search for the last parameter in the string.
This example returns 2222 as the correct value (in group 1) for 2ndParm
(?:.*)2ndParm\?#\?(.*?)\?#\?(?=.)(.*)
but it fails if 2ndParm is the last one in the string as in the following example:
1stparm?#?1111?#?2ndParm?#?2222?#?
I'd also appreciate help in returning just one group with my result. I haven't been able to do so, but since the value I'm interested in is always in group 1, I can get it easily anyway.
Without regex:
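$str = '1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?';
// Splitting on ?#? gives a flat list: name, value, name, value, ...
$keyval = explode('?#?', trim($str, '?#'));
$result = [];
// Walk the list two items at a time to pair each name with its value
foreach (array_chunk($keyval, 2) as [$key, $value]) {
    $result[$key] = $value;
}
print_r($result);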
You don't need to use a regex for everything, and you should have a serious talk with whoever invented this horrid format about the fact that JSON, YAML, TOML, XML, etc exist.
function bizarre_unserialize($in) {
$tmp = explode('?#?', $in);
$tmp = array_filter($tmp, function($a) { return $a !== ''; }); // remove empty
$tmp = array_chunk($tmp, 2); // pair each name with the value that follows it
// rearrange to key-value
return array_combine(array_column($tmp, 0), array_column($tmp, 1));
}
$input = '1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?';
var_dump(
bizarre_unserialize($input)
);
Output:
array(4) {
["1stparm"]=>
string(4) "1111"
["2ndParm"]=>
string(4) "2222"
["3rdParm"]=>
string(4) "3333"
["4thparm"]=>
string(3) "444"
}
You can use
(?P<key>.+?)
\Q?#?\E
(?P<value>.+?)
\Q?#?\E
in verbose mode.
The \Q...\E construct disables the ? and # "super-powers" (no need to escape them here).
In PHP this could be
<?php
$string = "1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?";
$regex = "~(?P<key>.+?)\Q?#?\E(?P<value>.+?)\Q?#?\E~";
preg_match_all($regex, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
echo $match["key"] . " = " . $match["value"] . "\n";
}
?>
Which yields
1stparm = 1111
2ndParm = 2222
3rdParm = 3333
4thparm = 444
Or shorter:
$result = array_map(
function($x) {return array($x["key"] => $x["value"]);}, $matches);
print_r($result);
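If only one parameter is needed, as in the original question, a single capture group is enough. The sketch below is one possible way to do that; the function name getParam is hypothetical, and it assumes a parameter name never also appears as a value earlier in the string.

```php
<?php
// Hypothetical helper: return the value of one named parameter, or null.
function getParam(string $input, string $name): ?string
{
    $sep = preg_quote('?#?', '~');
    // The name must sit at the start of the string or right after a
    // separator; the value is everything up to the next separator.
    $pattern = '~(?:^|' . $sep . ')' . preg_quote($name, '~') . $sep . '(.*?)' . $sep . '~s';
    return preg_match($pattern, $input, $m) === 1 ? $m[1] : null;
}

$input = '1stparm?#?1111?#?2ndParm?#?2222?#?';
var_dump(getParam($input, '2ndParm')); // string(4) "2222"
```

Because the pattern does not require anything after the closing separator, it also works when the parameter is the last one in the string.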
I have this string:
$str = "11ff11
22mm22
33gg33
mm22mm
vv55vv
77ll77
55kk55
kk22kk
bb11bb";
There are two kinds of patterns:
{two numbers}{two letters}{two numbers}
{two letters}{two numbers}{two letters}
I'm trying to match the first line each time the pattern changes. So I want to match these:
11ff11 -- this
22mm22
33gg33
mm22mm -- this
vv55vv
77ll77 -- this
55kk55
kk22kk -- this
bb11bb
Here is my current pattern:
/(\d{2}[a-z]{2}\d{2})|([a-z]{2}\d{2}[a-z]{2})/
But it matches all lines! How can I limit it to match just the first line of each run of the same pattern?
I could not do it with lookarounds because of the whitespace between lines, but it is doable with a plain regex. It matches each run of lines that share a pattern and captures only the first line of the run:
(?:(\d{2}[a-z]{2}\d{2})\s+)(?:\d{2}[a-z]{2}\d{2}\s+)*|(?:([a-z]{2}\d{2}[a-z]{2})\s+)(?:[a-z]{2}\d{2}[a-z]{2}\s+)*
To understand how it works, I made a simple example with single-character patterns, a digit and a letter:
(?:(\d)\s+)(?:\d\s+)*|(?:(a)\s+)(?:a\s+)*
Not sure if you can do this with only one expression, but you can iterate over the lines of your string and check when the pattern changes:
<?php
$str = "11ff11
22mm22
33gg33
mm22mm
vv55vv
77ll77
55kk55
kk22kk
bb11bb";
$exploded = explode(PHP_EOL, $str);
$patternA = '/(\d{2}[a-z]{2}\d{2})/';
$patternB = '/([a-z]{2}\d{2}[a-z]{2})/';
$result = [];
$currentPattern = '';
//get first and check what pattern is
if(preg_match($patternA, $exploded[0])){
$currentPattern = $patternA;
$result[] = $exploded[0];
} elseif(preg_match($patternB, $exploded[0])){
$currentPattern = $patternB;
$result[] = $exploded[0];
} else {
//.. no pattern on first element, should we continue?
}
//toggle
$currentPattern = $currentPattern == $patternA ? $patternB : $patternA;
foreach($exploded as $e) {
if(preg_match($currentPattern, $e)) {
//toggle
$currentPattern = $currentPattern == $patternA ? $patternB : $patternA;
$result[] = trim($e);
}
}
echo "<pre>";
var_dump($result);
echo "</pre>";
Output:
array(4) {
[0]=>
string(6) "11ff11"
[1]=>
string(6) "mm22mm"
[2]=>
string(6) "77ll77"
[3]=>
string(6) "kk22kk"
}
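The same iteration can be sketched a little more compactly by classifying each line and keeping it only when its class differs from the previous one. The function names classify and firstOfEachRun are hypothetical, and the sketch assumes that lines matching neither pattern should simply be skipped without ending the current run.

```php
<?php
// Hypothetical helper: label a line 'A' (digits-letters-digits),
// 'B' (letters-digits-letters), or null when it fits neither pattern.
function classify(string $line): ?string
{
    if (preg_match('/^\d{2}[a-z]{2}\d{2}$/', $line)) {
        return 'A';
    }
    if (preg_match('/^[a-z]{2}\d{2}[a-z]{2}$/', $line)) {
        return 'B';
    }
    return null;
}

// Keep the first line of every run of consecutive same-pattern lines.
function firstOfEachRun(string $text): array
{
    $result = [];
    $previous = null;
    foreach (preg_split('/\R/', $text) as $line) { // \R matches any line break
        $line = trim($line);
        $kind = classify($line);
        if ($kind === null) {
            continue; // assumption: unrecognised lines do not end a run
        }
        if ($kind !== $previous) {
            $result[] = $line;
        }
        $previous = $kind;
    }
    return $result;
}

$str = "11ff11\n22mm22\n33gg33\nmm22mm\nvv55vv\n77ll77\n55kk55\nkk22kk\nbb11bb";
echo implode(', ', firstOfEachRun($str)), "\n";
// prints: 11ff11, mm22mm, 77ll77, kk22kk
```

Splitting on \R rather than PHP_EOL means the input can use Unix or Windows line endings regardless of the platform the script runs on.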
Here's my take. Never used lookbehinds before and well, my regex skills are not that good but this does seem to return what you want.
/^.*|(?<=[a-z]{2}\n)\d{2}[a-z]{2}\d{2}|(?<=\d{2}\n)[a-z]{2}\d{2}[a-z]{2}/
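For reference, this is roughly how that pattern could be applied with preg_match_all(). The sketch assumes the lines are separated by a bare "\n" (not "\r\n") and carry no trailing spaces, because the lookbehinds expect the previous line to end immediately before the line break.

```php
<?php
// Sample data from the question, with "\n" line endings (an assumption).
$str = "11ff11\n22mm22\n33gg33\nmm22mm\nvv55vv\n77ll77\n55kk55\nkk22kk\nbb11bb";

// First alternative grabs the very first line; the other two match a line
// only when the line before it ended in the opposite kind of characters.
$pattern = '/^.*|(?<=[a-z]{2}\n)\d{2}[a-z]{2}\d{2}|(?<=\d{2}\n)[a-z]{2}\d{2}[a-z]{2}/';

preg_match_all($pattern, $str, $matches);
print_r($matches[0]);
// prints an array containing: 11ff11, mm22mm, 77ll77, kk22kk
```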
I want to extract part of a text with PHP. For example, given the data "The apple=10", I want to grab only the number, which always comes right after the equals sign.
My problem is that the number in the source can be 2 or 3 characters long; in other words, its length is not constant.
Please help me solve this :)
$string = "Apple=10 | Orange=3 | Banana=7";
$elements = explode("|", $string);
$values = array();
foreach($elements as $element)
{
$element = trim($element);
$val_array = explode("=", $element);
$values[$val_array[0]] = $val_array[1];
}
var_dump($values);
Output:
array(3) {
["Apple"]=> string(2) "10"
["Orange"]=> string(1) "3"
["Banana"]=> string(1) "7"
}
Hope that's what you need :)
Well, PHP is fairly lenient about int conversion, so 12345blablabla can be converted to 12345:
$value = intval(substr($str, strpos($str, '=') + 1));
Of course, this is not the cleanest way but it is simple. If you want something cleaner, you could use a regexp:
preg_match ('#=([0-9]+)#', $str, $matches);
$value = intval($matches[1]) ;
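The regexp version can be wrapped so that a string without any "=number" part does not trigger an undefined-index notice. This is only a sketch; the function name numberAfterEquals is hypothetical, and allowing optional whitespace after the equals sign is an assumption.

```php
<?php
// Hypothetical helper: return the integer that follows the first "=",
// or null when the string contains no such number.
function numberAfterEquals(string $str): ?int
{
    // \s* tolerates spaces after "=" (an assumption about the input).
    if (preg_match('/=\s*(\d+)/', $str, $matches) === 1) {
        return (int) $matches[1];
    }
    return null;
}

var_dump(numberAfterEquals('The apple=10'));  // int(10)
var_dump(numberAfterEquals('The apple=123')); // int(123)
```

Because \d+ matches as many digits as are present, the length of the number does not need to be known in advance.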
Try the below code:
$givenString= "The apple=10";
$required_string = substr($givenString, strpos($givenString, "=") + 1);
echo "output = ".$required_string ; // output = 10
The strpos() function finds the position of the first occurrence of a substring in a string,
and the substr() function returns part of a string.
What is an easy way to take a string that is formatted this way:
c:7|bn:99
and be able to use that string easily? For example, how could I easily get the number that comes after c:? And the same for the number after bn:?
You could use preg_match() function or you could use explode() function twice (first with | delimiter and second with : delimiter).
Example #1:
<?php
if( preg_match( '/^c:(\d+)\|bn:(\d+)$/', $sString, $aMatches ) )
{
print_r( $aMatches );
}
?>
Example #2:
<?php
$aPairs = explode('|', $sString ); // you have two elements in $aPairs
foreach( $aPairs as $sPair )
{
print_r( explode(':', $sPair ) );
}
?>
$arr = array();
$str = "c:7|bn:99";
$tmp1 = explode('|', $str);
foreach($tmp1 as $val)
{
$tmp2 = explode(':', $val);
$arr[$tmp2[0]] = $tmp2[1];
}
// print your array
print_r($arr);
// accessing a specific value
echo $arr['c']." ".$arr['bn'];
Try this:
$string = 'c:7|bn:99';
preg_match('/\Ac:([0-9]+)\|bn:([0-9]+)\z/', $string, $matches);
var_dump($matches);
If c & bn are not dynamic:
var_dump(sscanf("c:7|bn:99","c:%d|bn:%d"));
array(2) {
[0]=>
int(7)
[1]=>
int(99)
}
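sscanf() can also assign straight into variables and report how many values it filled, which makes it easy to reject malformed input. A minimal sketch, assuming the keys are always c and bn in that order; the function name parseCounts is hypothetical.

```php
<?php
// Hypothetical helper: parse "c:<int>|bn:<int>" into an array, or
// return null when the string does not match that shape.
function parseCounts(string $str): ?array
{
    // With extra arguments, sscanf() returns the number of values assigned.
    $assigned = sscanf($str, 'c:%d|bn:%d', $c, $bn);
    return $assigned === 2 ? ['c' => $c, 'bn' => $bn] : null;
}

var_dump(parseCounts('c:7|bn:99'));
```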
I have several strings of the format
AA11
AAAAAA1111111
AA1111111
Which is the best (most efficient) way to separate the alphabetic and numeric components of the string?
If they're all a series of alphabetic characters followed by a series of digits, with no non-alphanumeric characters, then sscanf() is probably more efficient than a regexp:
$example = 'AAA11111';
list($alpha,$numeric) = sscanf($example, "%[A-Z]%d");
var_dump($alpha);
var_dump($numeric);
preg_split should do the job fine. Split at the position where a letter is followed by a digit:
preg_split('/(?<=[A-Za-z])(?=\d)/', $input);
The preg library is surprisingly efficient at handling strings, so I would assume it to be more efficient than anything you can write by hand using more primitive string functions. But do a test and see for yourself.
Here is a working example using preg_split():
$strs = array( 'AA11', 'AAAAAA1111111', 'AA1111111');
foreach( $strs as $str)
foreach( preg_split( '/([A-Za-z]+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY) as $temp)
var_dump( $temp);
This outputs:
string(2) "AA"
string(2) "11"
string(6) "AAAAAA"
string(7) "1111111"
string(2) "AA"
string(7) "1111111"
Instead of using RegEx straight away you can add one extra check for example:
if (ctype_alpha($testcase)) {
// Return the value it's only letters
} else if(ctype_digit($testcase)) {
// Return the value it's only numbers
} else {
//RegEx your string to split nums and alphas
}
EDIT: My answer didn't give any evidence of which approach performs better, so I ran a test that produced the following result:
preg_split took 5.3319189548492 seconds
sscanf took 3.4432129859924 seconds
And the answer should have been sscanf
Here's the code that produced the result:
$string = "AAAAAAAAAA111111111111111";
$count = 1000000;
function prSplit($string) {
return preg_split( '/([A-Za-z]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}
function sScanfTest($string) {
return sscanf($string, "%[A-Z]%[0-9]");
}
function microtime_float()
{
list($usec, $sec) = explode(" ", microtime());
return ((float)$usec + (float)$sec);
}
$startTime1 = microtime_float();
for($i=0; $i<$count; ++$i) {
prSplit($string);
}
$time1 = microtime_float() - $startTime1;
echo '1. preg_split took '.$time1.' seconds<br />';
$startTime2 = microtime_float();
for($i=0; $i<$count; ++$i) {
sScanfTest($string);
}
$time2 = microtime_float() - $startTime2;
echo '2. sscanf took '.$time2.' seconds';
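The same comparison can be written with a small reusable timer. The sketch below uses hrtime(), which requires PHP 7.3 or later; the helper name timeIt and the lower iteration count are my own choices, and the absolute numbers will of course differ from machine to machine.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return the
// elapsed wall-clock time in seconds.
function timeIt(callable $fn, int $iterations): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; ++$i) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

$string = 'AAAAAAAAAA111111111111111';
$results = [
    'preg_split' => timeIt(function () use ($string) {
        preg_split('/([A-Za-z]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    }, 100000),
    'sscanf' => timeIt(function () use ($string) {
        sscanf($string, '%[A-Z]%[0-9]');
    }, 100000),
];

foreach ($results as $name => $seconds) {
    printf("%s took %.4f seconds\n", $name, $seconds);
}
```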
This seems to work but when you try to pass something like "111111", it doesn't.
In my application, I am expecting several scenarios and what seems to be doing the trick is this
$referenceNumber = "AAA12132";
$splited = preg_split('/(\d+)/', $referenceNumber, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
var_dump($splited);
Note:
If you get an array of 2 elements, the 0th index is the alpha part and the 1st is the numeric part.
If you get an array of just 1 element, the string has only one component (for "111111", the 0th element is the numeric part and there are no alphas).
If you get more than 2 array items, then your string must be in a format like “AAA1323SDC”.
So given the above, you can play around with it based on your use case.
Cheers!
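An alternative sketch that avoids counting array elements: match the whole string against "letters, then digits" and return both parts by name. The function name splitAlphaNumeric is hypothetical, and treating mixed strings such as "AAA1323SDC" as invalid is an assumption about the use case.

```php
<?php
// Hypothetical helper: split "AAA12132" into its alpha and numeric parts.
// Either part may be empty; anything else (e.g. "AAA1323SDC") gives null.
function splitAlphaNumeric(string $value): ?array
{
    // \A and \z anchor to the true start and end of the string.
    if ($value === '' || preg_match('/\A([A-Za-z]*)(\d*)\z/', $value, $m) !== 1) {
        return null;
    }
    return ['alpha' => $m[1], 'numeric' => $m[2]];
}

var_dump(splitAlphaNumeric('AAA12132')); // alpha "AAA", numeric "12132"
var_dump(splitAlphaNumeric('111111'));   // alpha "",    numeric "111111"
```

Keeping the numeric part as a string preserves leading zeros, which matters for things like reference numbers.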