Regex Help with manipulating string - php

I am seriously struggling to get my head around regex.
I have a string with "iPhone: 52.973053,-0.021447"
I want to extract the two numbers after the colon into two separate strings, delimited by the comma.
Can anyone help me? Cheers

Try:
preg_match_all('/\w+:\s*(-?\d+\.\d+),(-?\d+\.\d+)/',
"iPhone: 52.973053,-0.021447 FOO: -1.0,-1.0",
$matches, PREG_SET_ORDER);
print_r($matches);
which produces:
Array
(
[0] => Array
(
[0] => iPhone: 52.973053,-0.021447
[1] => 52.973053
[2] => -0.021447
)
[1] => Array
(
[0] => FOO: -1.0,-1.0
[1] => -1.0
[2] => -1.0
)
)
Or just:
preg_match('/\w+:\s*(-?\d+\.\d+),(-?\d+\.\d+)/',
"iPhone: 52.973053,-0.021447",
$match);
print_r($match);
if the string only contains one coordinate.
A small explanation:
\w+ # match a word character: [a-zA-Z_0-9] and repeat it one or more times
: # match the character ':'
\s* # match a whitespace character: [ \t\n\x0B\f\r] and repeat it zero or more times
( # start capture group 1
-? # match the character '-' and match it once or none at all
\d+ # match a digit: [0-9] and repeat it one or more times
\. # match the character '.'
\d+ # match a digit: [0-9] and repeat it one or more times
) # end capture group 1
, # match the character ','
( # start capture group 2
-? # match the character '-' and match it once or none at all
\d+ # match a digit: [0-9] and repeat it one or more times
\. # match the character '.'
\d+ # match a digit: [0-9] and repeat it one or more times
) # end capture group 2
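To tie the explanation above together, here is a minimal sketch that wraps the same pattern in a small function and casts both captures to floats. The name parseCoordinates is hypothetical (not from the question), and returning null on a failed match is just one possible choice.

```php
<?php
// Hypothetical helper: returns [latitude, longitude] as floats, or null when
// the input does not look like "label: number,number".
function parseCoordinates(string $input): ?array
{
    $pattern = '/\w+:\s*(-?\d+\.\d+),(-?\d+\.\d+)/';
    if (preg_match($pattern, $input, $m) !== 1) {
        return null;
    }
    // $m[1] and $m[2] hold capture groups 1 and 2 from the explanation above.
    return [(float) $m[1], (float) $m[2]];
}

$coords = parseCoordinates('iPhone: 52.973053,-0.021447');
var_dump($coords);
```

If you need the values as strings, as the question asks, drop the (float) casts.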

A solution without using regular expressions, using explode() and stripos() :) :
$string = "iPhone: 52.973053,-0.021447";
$coordinates = explode(',', $string);
// $coordinates[0] = "iPhone: 52.973053"
// $coordinates[1] = "-0.021447"
$coordinates[0] = trim(substr($coordinates[0], stripos($coordinates[0], ':') +1));
Assuming that the string always contains a colon.
Or, if the identifier before the colon only contains letters (no digits), you can also do this:
$string = "iPhone: 52.973053,-0.021447";
$string = trim($string, "a..zA..Z: ");
//$string = "52.973053,-0.021447"
$coordinates = explode(',', $string);

Try:
$string = "iPhone: 52.973053,-0.021447";
preg_match_all( "/-?\d+\.\d+/", $string, $result );
print_r( $result );

I like @Felix's non-regex solution; I think it is clearer and more readable than using a regex.
Don't forget that you can use constants/variables for the comma and colon separators, so the splitting is easy to change if the original string format changes.
Something like
define('COORDINATE_SEPARATOR',',');
define('DEVICE_AND_COORDINATES_SEPARATOR',':');
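A rough sketch of how those two constants might be used with explode(). The function name splitDeviceString and the shape of the returned array are my own assumptions, and the code assumes the input always contains the device separator.

```php
<?php
define('COORDINATE_SEPARATOR', ',');
define('DEVICE_AND_COORDINATES_SEPARATOR', ':');

// Hypothetical helper: splits "device: lat,lng" into the device name and
// the list of coordinates. Assumes the device separator is present.
function splitDeviceString(string $input): array
{
    // Limit of 2, so only the first device separator is used for splitting.
    [$device, $rest] = explode(DEVICE_AND_COORDINATES_SEPARATOR, $input, 2);
    $coordinates = array_map('trim', explode(COORDINATE_SEPARATOR, $rest));

    return ['device' => trim($device), 'coordinates' => $coordinates];
}

$parsed = splitDeviceString('iPhone: 52.973053,-0.021447');
print_r($parsed);
```

If the format later changes to, say, a semicolon between the coordinates, only the define() line needs to change.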

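$str = "iPhone: 52.973053,-0.021447";
// Split on runs of letters, colons, commas and whitespace; PREG_SPLIT_NO_EMPTY drops the empty pieces
$s = preg_split("/[a-zA-Z:,\s]+/", $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($s);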

An even simpler solution is to use preg_split() with a much simpler regex, e.g.
$str = 'iPhone: 52.973053,-0.021447';
$parts = preg_split('/[ ,]/', $str);
print_r($parts);
which will give you
Array
(
[0] => iPhone:
[1] => 52.973053
[2] => -0.021447
)

Related

Regex of number inside brackets

I need to get the float number inside brackets.
I tried '([0-9]*[.])?[0-9]+', but it returns the first number, like 6 in the first example.
I also tried
'/\((\d+)\)/'
but it returns 0.
Please note that the extracted number may be either an int or a float.
Can you please help?
Since you also need to match the brackets, add escaped parentheses \( and \) to the regular expression:
$str = 'Serving size 6 pieces (40)';
$str1 = 'Per bar (41.5)';
preg_match('#\(([0-9]*[.]?[0-9]+)\)#', $str, $matches);
print_r($matches);
preg_match('#\(([0-9]*[.]?[0-9]+)\)#', $str1, $matches);
print_r($matches);
Output:
Array
(
[0] => (40)
[1] => 40
)
Array
(
[0] => (41.5)
[1] => 41.5
)
You could escape the brackets:
$str = 'Serving size 6 pieces (41.5)';
if (preg_match('~\((\d+\.?\d*)\)~', $str, $matches)) {
    print_r($matches);
}
Outputs:
Array
(
[0] => (41.5)
[1] => 41.5
)
Regex:
\( # open bracket
( # capture group
\d+ # one or more digits
\.? # optional dot (escaped, otherwise it would match any character)
\d* # optional digits
) # end capture group
\) # close bracket
You could also use this to allow at most one digit after the dot:
'~\((\d+\.?\d?)\)~'
You need to escape the brackets
preg_match('/\((\d+(?:\.\d+)?)\)/', $search, $matches);
Explanation:
\( escaped bracket to look for
( open subpattern
\d a digit
+ one or more occurrences of the preceding character
( open group
?: don't save the data in a subpattern
\. escaped dot
\d a digit
+ one or more occurrences of the preceding character
) close group
? one or no occurrence of the preceding group
) close subpattern
\) escaped closing bracket to look for
It matches numbers like
1,
1.1,
11,
11.11,
111,
111.111 but NOT .1 or .
https://regex101.com/r/ei7bIM/1
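A quick way to check that list yourself is to run the pattern over a few inputs. This is only a sketch; the sample strings are made up for illustration, and a failed match is stored as null.

```php
<?php
$pattern = '/\((\d+(?:\.\d+)?)\)/';

// Sample inputs chosen for illustration; the last two are expected not to match.
$samples = ['(1)', '(1.1)', '(111.111)', 'Per bar (41.5)', '(.1)', '(.)'];

$results = [];
foreach ($samples as $sample) {
    // Store the captured number, or null when the pattern does not match.
    $results[$sample] = preg_match($pattern, $sample, $m) === 1 ? $m[1] : null;
}
print_r($results);
```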
You could match an opening parenthesis, use \K to reset the starting point of the reported match and then match your value:
\(\K\d+(?:\.\d+)?(?=\))
That would match:
\( Match (
\K Reset the starting point of the reported match
\d+ Match one or more digits
(?: Non capturing group
\.\d+ Match a dot and one or more digits
)? Close non capturing group and make it optional
(?= Positive lookahead that asserts what follows is
\) Match )
) Close positive lookahead
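In PHP that could look roughly like the sketch below. The two input strings are the ones used elsewhere in this thread; the variable names are mine.

```php
<?php
$pattern = '/\(\K\d+(?:\.\d+)?(?=\))/';

preg_match($pattern, 'Serving size 6 pieces (40)', $first);
preg_match($pattern, 'Per bar (41.5)', $second);

// Thanks to \K and the lookahead, the full match in index 0 is already the
// bare number, so no capture group is needed.
echo $first[0], "\n";
echo $second[0], "\n";
```

The trade-off compared to a capture group is mostly taste: \K keeps the result in index 0, at the cost of a slightly less familiar pattern.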

php special character trim

I have a set of strings like below :
fw-sophi.watcon.-.120
d-elain.heckop.-.121
sim.boosh.-.134
bh.-.elain.heckop.-.244
How would I trim the following set of strings to return only the middle section?
expected return:
sophi.watcon
elain.heckop
sim.boosh
elain.heckop
It isn't practical to trim() the strings manually as there are a lot of them. How would I do this programmatically?
If there are no other patterns, this should be sufficient:
$input = 'fw-sophi.watcon.-.120
d-elain.heckop.-.121
sim.boosh.-.134
bh.-.elain.heckop.-.244';
if (preg_match_all('~(?<key>\w+\.\w+)~', $input, $matches)) {
print_r($matches['key']);
}
Resulting array:
Array
(
[0] => sophi.watcon
[1] => elain.heckop
[2] => sim.boosh
[3] => elain.heckop
)
Pattern explanation:
~ #pattern start
( #start group to capture matched result
?<key> #give a name to group. see $matches['key'].
\w+ #one or more word characters (letters, digits or underscore)
\. #dot character. we need to escape it with \
\w+ #one or more word characters (letters, digits or underscore)
) #end group
~ #pattern end
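If the strings arrive as an array rather than one block of text, the same pattern can be applied per line. A minimal sketch, assuming each string contains one "word.word" section; extractMiddle is a hypothetical name.

```php
<?php
// Hypothetical helper: returns the first "word.word" pair in a line, or null.
function extractMiddle(string $line): ?string
{
    return preg_match('~(?<key>\w+\.\w+)~', $line, $m) === 1 ? $m['key'] : null;
}

$lines = [
    'fw-sophi.watcon.-.120',
    'd-elain.heckop.-.121',
    'sim.boosh.-.134',
    'bh.-.elain.heckop.-.244',
];

$middles = array_map('extractMiddle', $lines);
print_r($middles);
```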

PHP preg_match_all strings starting with C and length of 4

I'm attempting to search a text document with a PHP script and return all strings that start with the character "C" and have a length of 4. Some of the results will end in "=" but most will end in an alphanumeric character.
I was able to successfully pull the ones that start with C and end in =.
<?php
$str = file_get_contents('./FILENAME.txt', true);
preg_match_all('/C(.{2,3})=/', $str, $matches);
print_r($matches[0]);
foreach ($matches[0] as $sub)
{
$file = './CapturedData.txt';
file_put_contents($file, print_r("\"".$sub . "\"\n", true), FILE_APPEND);
}
?>
But when I tried to adjust it to pull all strings that start with C, end in any alphanumeric character and have a length of 3/4, it just returned the first 3/4 characters of strings of ANY length.
I know I'm missing something simple, but it's killing me. Anything I try just keeps returning the first x characters of strings of any length, while I only want to return strings that have a length of 3/4, start with "C" and end in =, a-z, A-Z or 0-9.
EDIT:
Lets say these are the strings in the document:
blahblahblahCaa=
Cae=
CGGG
dontmatchthisCAAA
CAAAjkjkjk
XXXXXXCXXXX
I only want to return the 2nd and 3rd line
You need to provide anchors to do an exact line match.
^C(?:\S{2,3})[a-zA-Z0-9=]$
$input = <<<EOT
blahblahblahCaa=
Cae=
CGGG
dontmatchthisCAAA
CAAAjkjkjk
XXXXXXCXXXX
EOT;
preg_match_all("~^C(?:\S{2,3})[a-zA-Z0-9=]$~m", $input, $match);
print_r($match);
Output:
Array
(
[0] => Array
(
[0] => Cae=
[1] => CGGG
)
)
Regular Expression:
^ the beginning of a line (because of the m modifier)
C 'C'
(?: group, but do not capture:
\S{2,3} non-whitespace (all but \n, \r, \t, \f, and " ") (between 2 and 3 times)
) end of grouping
[a-zA-Z0-9=] any character of: 'a' to 'z', 'A' to 'Z', '0' to '9', '='
$ the end of a line (because of the m modifier)
The comment above alludes to it: your regex does not specify that the string has to end at the ending character, just that it has to contain it. So Caaaa matches, but so does Caaaabbb, and so would Caaa=bbb. You don't say what your input format is, but if it's one word per line, you can match /^C(..|...)[a-zA-Z0-9=]$/m
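To make the effect of the anchors concrete, here is a small comparison using the sample lines from the question. It is only a sketch: the character class [a-zA-Z0-9=]{3} is borrowed from another answer in this thread and fixes the length at exactly 4, which may or may not be what you want.

```php
<?php
$input = "blahblahblahCaa=\nCae=\nCGGG\ndontmatchthisCAAA\nCAAAjkjkjk\nXXXXXXCXXXX";

// Unanchored: finds a "C" plus three allowed characters anywhere in a line.
preg_match_all('/C[a-zA-Z0-9=]{3}/', $input, $loose);

// Anchored with ^ and $ under the m modifier: the whole line has to match.
preg_match_all('/^C[a-zA-Z0-9=]{3}$/m', $input, $strict);

print_r($loose[0]);
print_r($strict[0]);
```

The unanchored version picks up a fragment from every one of the six lines, while the anchored one returns only Cae= and CGGG.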
From what I understand,
\bC\S{2,3}=?(?=\s)
Example : http://regex101.com/r/lR6kI3/3
\b matches a word boundary, so the C has to start a word
C matches a C
\S{2,3} matches any non-whitespace character, 2 or 3 times
=? matches an optional =
(?=\s) checks that the match is followed by whitespace
Example usage
$re = "/\\bC\\S{2,3}=?(?=\\s)/m";
$str = "blahblahblahCaa= Cae= CGGG dontmatchthisCAAA CAAAjkjkjk XXXXXXCXXXX ";
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Will give an output as
Array ( [0] => Cae= [1] => CGGG )
Try this:
$str = file_get_contents('FILENAME.txt', true);
// $str = "blahblahblahCaa=\nCae=\nCGGG\ndontmatchthisCAAA\nCAAAjkjkjk\nXXXXXXCXXXX";
preg_match_all('/^C[a-zA-Z0-9=]{3}$/m', $str, $matches);
var_dump($matches[0]);
foreach ($matches[0] as $sub)
{
$file = 'CapturedData.txt';
// append a newline so the captured strings do not run together in the file
file_put_contents($file, $sub . "\n", FILE_APPEND);
}

Multiple Hash Tags removal

function getHashTagsFromString($str){
    $matches = array();
    $hashTag = array();
    if (preg_match_all('/#([^\s]+)/', $str, $matches)) {
        for ($i = 0; $i < sizeof($matches[1]); $i++) {
            $hashTag[$i] = $matches[1][$i];
        }
    }
    // always return an array, even when there are no tags
    return $hashTag;
}
Test string:
$str = <<<STR
this is a string
with a #tag and
another #hello #hello2 ##hello3 one
STR;
Using the above function I get the tags, but I am not able to remove both # characters from ##hello3. How can I do that with a single regular expression?
Update your regular expression as follows:
/#+(\S+)/
Explanation:
/ - starting delimiter
#+ - match the literal # character one or more times
(\S+) - match (and capture) one or more non-space characters (\S is shorthand for [^\s])
/ - ending delimiter
The output will be as follows:
Array
(
[0] => tag
[1] => hello
[2] => hello2
[3] => hello3
)
EDIT: To match all the hash tags use:
preg_match_all('/#\S+/', $str, $match);
To remove them, use preg_replace instead of preg_match_all:
$repl = preg_replace('/#\S+/', '', $str);
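Putting extraction and removal side by side on the question's test string, as a rough sketch. The whitespace clean-up after the removal is my own addition, not part of the answers above.

```php
<?php
$str = "this is a string\nwith a #tag and\nanother #hello #hello2 ##hello3 one";

// Collect the tag names without any leading # characters.
preg_match_all('/#+(\S+)/', $str, $matches);
$tags = $matches[1];

// Remove the tags, then collapse the runs of spaces left behind
// (an extra tidy-up step, assumed to be wanted).
$clean = preg_replace('/#\S+/', '', $str);
$clean = trim(preg_replace('/[ \t]{2,}/', ' ', $clean));

print_r($tags);
echo $clean, "\n";
```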

Split string with regular expressions

I have this string:
EXAMPLE|abcd|[!PAGE|title]
I want to split it like this:
Array
(
[0] => EXAMPLE
[1] => abcd
[2] => [!PAGE|title]
)
How to do it?
Thank you.
If you don't need anything more than what you described, this is like parsing a CSV with | as the separator and [ ] in place of the quotes, so (\[.*?\]+|[^\|]+)(?=\||$) should do the job, I think.
EDIT: Changed the regex; it now accepts strings like [asdf]].[]asf]
Explanation:
(\[.*?\]+|[^\|]+) -> This is divided into 2 parts (it will match 1.1 or 1.2):
1.1 \[.*?\]+ -> Matches everything between [ and ]
1.2 [^\|]+ -> Matches a run of characters that are not |
(?=\||$) -> Requires that the match is followed by a | or the end of the string, which is what lets the regex accept strings like the earlier example.
Given your example, you could use (\[.*?\]|[^|]+).
preg_match_all("#(\[.*?\]|[^|]+)#", "EXAMPLE|abcd|[!PAGE|title]", $matches);
print_r($matches[0]);
// output:
Array
(
[0] => EXAMPLE
[1] => abcd
[2] => [!PAGE|title]
)
Use this regex: (?<=\||^)(((\[.*\|?.*\])|(.+?)))(?=\||$)
(?<=\||^) Positive LookBehind
1st alternative: \|Literal `|`
2nd alternative: ^Start of string
1st Capturing group (((\[.*\|?.*\])|(.+?)))
2nd Capturing group ((\[.*\|?.*\])|(.+?))
1st alternative: (\[.*\|?.*\])
3rd Capturing group (\[.*\|?.*\])
\[ Literal `[`
. infinite to 0 times Any character (except newline)
\| 1 to 0 times Literal `|`
. infinite to 0 times Any character (except newline)
\] Literal `]`
2nd alternative: (.+?)
4th Capturing group (.+?)
. 1 to infinite times [lazy] Any character (except newline)
(?=\||$) Positive LookAhead
1st alternative: \|Literal `|`
2nd alternative: $End of string
g modifier: global. All matches (don't return on first match)
A non-regex solution:
$str = str_replace('[', ']', "EXAMPLE|abcd|[!PAGE|title]");
// ']' now acts as the enclosure character, so the brackets themselves are stripped from the result
$arr = str_getcsv($str, '|', ']');
If you expect things like "[[]]", you would have to escape the inner brackets with slashes, in which case regex might be the better option.
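For nested brackets, another option that avoids both regex and escaping is to scan the string and track the bracket depth. This is a minimal sketch under my own assumptions: the function name splitOutsideBrackets is hypothetical, it works byte by byte, and it expects a single-character separator.

```php
<?php
// Hypothetical helper: splits on $separator except inside [...] (nesting allowed).
function splitOutsideBrackets(string $input, string $separator = '|'): array
{
    $parts = [];
    $current = '';
    $depth = 0;

    foreach (str_split($input) as $char) {
        if ($char === '[') {
            $depth++;
        } elseif ($char === ']' && $depth > 0) {
            $depth--;
        }

        if ($char === $separator && $depth === 0) {
            $parts[] = $current; // separator outside brackets: close the field
            $current = '';
        } else {
            $current .= $char;   // everything else, brackets included, is kept
        }
    }
    $parts[] = $current;

    return $parts;
}

$fields = splitOutsideBrackets('EXAMPLE|abcd|[!PAGE|title]');
print_r($fields);
```

Unlike the str_getcsv() approach, this keeps the brackets in the output, which matches the array the question asks for.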
http://de2.php.net/manual/en/function.explode.php
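$array = explode('|', $string);
// Note: a plain explode() also splits inside the brackets, so the example string yields four elements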
