When I run this regular expression:
preg_match_all('~(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)~', $content, $turls);
print_r($turls);
I get an array inside an array. I need a single array only.
How do I negotiate the arrays nested inside another array?
By default preg_match_all() uses the PREG_PATTERN_ORDER flag, which means:
Orders results so that $matches[0] is
an array of full pattern matches,
$matches[1] is an array of strings
matched by the first parenthesized
subpattern, and so on.
See http://php.net/preg_match_all
Here is sample output:
array(
0 => array( // Full pattern matches
0 => 'http://www.w3.org/TR/html4/strict.dtd',
1 => ...
),
1 => array( // First parenthesized subpattern.
// In your case it is the same as the full pattern, because the first
// parenthesized subpattern wraps the whole pattern :-)
0 => 'http://www.w3.org/TR/html4/strict.dtd',
1 => ...
),
2 => array( // Second parenthesized subpattern.
0 => 'www.w3.org',
1 => ...
),
...
)
So, as R. Hill answered, you need $matches[0] ($turls[0] in your code) to access all matched URLs.
And as budinov.com pointed out, you should remove the outer parentheses so that the second element does not duplicate the first one, e.g.:
preg_match_all('~https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?~', $content, $turls);
// where $turls[0] is what you need
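If the inner groups are not needed either, they can be made non-capturing with (?: ... ), which leaves index 0 as the only element of the result. A minimal sketch, assuming a made-up $content string and a slightly simplified host part ([-\w.]+ without the extra repetition):

```php
<?php
// Hypothetical sample input; any text containing URLs would do.
$content = 'See http://www.w3.org/TR/html4/strict.dtd and https://example.com:8080/path?x=1 for details.';

// Every group is non-capturing, so nothing besides the full match is stored.
$pattern = '~https?://[-\w.]+(?::\d+)?(?:/(?:[\w/.]*(?:\?\S+)?)?)?~';

preg_match_all($pattern, $content, $turls);

// Only index 0 exists now, so $turls[0] is the flat list of URLs.
$urls = $turls[0];
print_r($urls);
```

The outer array is still there, but it has a single element, so $turls[0] is the "single array" the question asks for.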
Not sure what you mean by 'negotiate'. If you mean fetching the inner array, this should work:
$urls = preg_match_all('~(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)~', $content, $matches) ? $matches[0] : array();
if ( count($urls) ) {
...
}
Generally you can replace your regexp with one that doesn't contain parentheses (). This way your results will be held only in $turls[0]:
preg_match_all('/https?\:\/\/[^\"\'\s]+/i', file_get_contents('http://www.yahoo.com'), $turls);
and then make the URLs unique like this:
$result = array_keys(array_flip($turls[0]));
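An arguably more readable way to get the same de-duplicated list is array_unique() followed by array_values(). A small sketch, where $found is a made-up list standing in for $turls[0]:

```php
<?php
// Hypothetical match list standing in for $turls[0].
$found = ['http://a.example/', 'http://b.example/', 'http://a.example/'];

// array_unique() keeps the first occurrence of each value but preserves keys,
// so array_values() is used to reindex the result from 0.
$result = array_values(array_unique($found));

print_r($result);
```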
Related
I have an input string like this:
"Day":June 8-10-2012,"Location":US,"City":Newyork
I need to match 3 value substrings:
June 8-10-2012
US
Newyork
I don't need the labels.
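Per my comment above, if this is JSON, you should definitely use PHP's JSON functions (such as json_decode()), as they are better suited to this.
However, you can use the following regex. The g flag is regex101 notation; in PHP, drop it and call preg_match_all() to get every match:
/:([a-zA-Z0-9\s-]*)/g
<?php
preg_match_all('/:([a-zA-Z0-9\s-]*)/', '"Day":June 8-10-2012,"Location":US,"City":Newyork', $matches);
print_r($matches[1]); // the captured values, without the leading colon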
The regex demo is here:
https://regex101.com/r/BbwVQ5/1
Here are a couple of simple ways:
Code: (Demo)
$string = '"Day":June 8-10-2012,"Location":US,"City":Newyork';
var_export(preg_match_all('/:\K[^,]+/', $string, $out) ? $out[0] : 'fail');
echo "\n\n";
var_export(preg_split('/,?"[^"]+":/', $string, 0, PREG_SPLIT_NO_EMPTY));
Output:
array (
0 => 'June 8-10-2012',
1 => 'US',
2 => 'Newyork',
)
array (
0 => 'June 8-10-2012',
1 => 'US',
2 => 'Newyork',
)
Pattern #1: \K restarts the match after the : so that a positive lookbehind can be avoided (saving "steps" / improving pattern efficiency). By matching all following characters that are not a comma, a capture group can be avoided as well (again saving "steps").
Pattern #2: ,? makes the comma optional and qualifies the leading double-quoted "key" to be matched (split on). The targeted substring to split on will match the full "key" substring and end on the following : colon.
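For comparison, here is a sketch of the lookbehind and capture-group variants that pattern #1 avoids. The variable names are only illustrative; for this input all three variants should produce the same values:

```php
<?php
$string = '"Day":June 8-10-2012,"Location":US,"City":Newyork';

// Variant with \K: the colon is matched, then dropped from the reported match.
preg_match_all('/:\K[^,]+/', $string, $withK);

// Variant with a positive lookbehind: the colon is asserted, never consumed.
preg_match_all('/(?<=:)[^,]+/', $string, $withLookbehind);

// Variant with a capture group: the values land in index 1 instead of index 0.
preg_match_all('/:([^,]+)/', $string, $withGroup);

var_export($withK[0] === $withLookbehind[0] && $withK[0] === $withGroup[1]);
```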
How can I extract values from a string and put them into an associative array, where each key is the name of a wildcard in a given template?
Given template is:
param1/prefix-{wildcard1}/{wildcard2}/param2
Given string is:
param1/prefix-name/lastname/param2
The result must be
array('wildcard1' => 'name', 'wildcard2' => 'lastname');
UPD
I want to implement a routing script. The wildcards must be variable names that will be injected into the script, and they will be loaded dynamically from other classes.
I'd first transform the template into a regex with named capture groups, then do the preg_match.
$template = 'param1/prefix-{wildcard1}/{wildcard2}/param2';
// escape special characters
$template = preg_quote($template, '/');
// first, change all {wildcardN} into (?<wildcardN>.*?)
$regex = preg_replace('/\\\{([^}]+)\\\}/', "(?<$1>.*?)", $template);
$string = 'param1/prefix-name/lastname/param2';
// do the preg_match using the regex
preg_match("/$regex/", $string, $match);
print_r($match);
Output:
Array
(
[0] => param1/prefix-name/lastname/param2
[wildcard1] => name
[1] => name
[wildcard2] => lastname
[2] => lastname
)
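The same idea can be wrapped in a function that anchors the pattern and returns only the named keys. This is a sketch under a few assumptions: matchTemplate is a hypothetical name, placeholders are assumed to be valid group names (letters, digits and underscores, not starting with a digit), and each wildcard is assumed to stop at the next slash:

```php
<?php
/**
 * Hypothetical helper: match $path against a {placeholder} template and
 * return only the named values, or null when the path does not match.
 */
function matchTemplate(string $template, string $path): ?array
{
    // Escape regex metacharacters; '{' and '}' become '\{' and '\}'.
    $quoted = preg_quote($template, '~');

    // Turn each escaped {name} into a named group that stops at the next slash.
    $regex = preg_replace('~\\\\\{(\w+)\\\\\}~', '(?<$1>[^/]+)', $quoted);

    // Anchor both ends so partial matches are rejected.
    if (!preg_match('~^' . $regex . '$~', $path, $match)) {
        return null;
    }

    // Drop the numeric duplicates, keeping only the named (string) keys.
    return array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);
}

var_export(matchTemplate(
    'param1/prefix-{wildcard1}/{wildcard2}/param2',
    'param1/prefix-name/lastname/param2'
));
```

Anchoring with ^ and $ matters for routing: without it, a longer path that merely contains the template would also match.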
If you make an array and fill it with values using preg_match_all, how do you retrieve each separate value from the array (basically scan through it from index 0 to length)? I want to take each value and perform another function on it, so how would I return each value stored in the array? (I want arr's value at index 0, at index 1, etc.)
$contents = file_get_contents('words.txt');
$arr = array();
preg_match_all($pattern, $contents, $arr); //finds all matches
$curr = current($arr);
I tried doing this (my pattern was written elsewhere) and echoing it afterwards, but I keep getting the string "Array".
$contents=file_get_contents('words.txt');
$arr=array();
preg_match_all($pattern,$contents,$arr);//finds all matches
foreach ($arr as $item) {
$curr=current($arr);
// do something
}
preg_match_all() takes at least two parameters: (1) the regex and (2) the string to match it against. It also takes an optional third parameter, which is automatically populated with all the matches found. That third argument is a numerically indexed array of arrays: with the default flags, index 0 holds the full matches and each following index holds one capture group. That is why echoing one of its elements prints "Array". You can loop through $arr[0] as you would any normal array. The snippet below demonstrates this using your code:
<?php
$contents = file_get_contents('words.txt');
// No need to pre-declare $arr; preg_match_all() creates it.
preg_match_all($pattern, $contents, $arr); // finds all matches
// $arr[0] holds the full matches; $arr[1], $arr[2], ... hold the capture groups.
$matches = $arr[0];
// If you just want the first match, simply use current():
$curr = current($matches);
// You may keep moving the internal pointer to the next item like so:
$next1 = next($matches);
$next2 = next($matches);
// Or just loop through the found matches:
foreach ($matches as $match) {
    // $match contains the matched string for the current iteration.
    var_dump($match);
}
// You may even use numeric indexes to access them like so:
$count = count($matches);
$elem0 = isset($matches[0]) ? $matches[0] : null;
$elem1 = isset($matches[1]) ? $matches[1] : null;
$elem2 = isset($matches[2]) ? $matches[2] : null;
$elem3 = isset($matches[3]) ? $matches[3] : null; // and so on...
If I understand your question correctly, you want to know what the results stored in the third parameter of preg_match_all look like.
preg_match_all stores the results in the third parameter as a 2-dimensional array.
Two structures are possible and can be explicitly set with two constants in the fourth parameter.
Let's say you have a pattern with two capture groups and your subject string contains 3 occurrences of your pattern:
1) PREG_PATTERN_ORDER is the default setting and returns something like this:
[
0 => [ 0 => 'whole match 1',
1 => 'whole match 2',
2 => 'whole match 3' ],
1 => [ 0 => 'capture group 1 from match 1',
1 => 'capture group 1 from match 2',
2 => 'capture group 1 from match 3' ],
2 => [ 0 => 'capture group 2 from match 1',
1 => 'capture group 2 from match 2',
2 => 'capture group 2 from match 3' ]
]
2) PREG_SET_ORDER returns something like this:
[
0 => [ 0 => 'whole match 1',
1 => 'capture group 1 from match 1',
2 => 'capture group 2 from match 1' ],
1 => [ 0 => 'whole match 2',
1 => 'capture group 1 from match 2',
2 => 'capture group 2 from match 2' ],
2 => [ 0 => 'whole match 3',
1 => 'capture group 1 from match 3',
2 => 'capture group 2 from match 3' ]
]
Concrete examples can be found in the PHP manual. You only need to choose whichever option is more convenient for what you need to do.
So, to apply your function, all you need to do is run a foreach loop or use array_map (which is a hidden loop too, but shorter and slower).
In general, if you aren't sure what the structure of a variable is, you can use print_r or var_dump to see what it looks like.
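Both layouts can be tried side by side. The pattern and subject below are made up for illustration (two capture groups, three occurrences), and strtoupper stands in for whatever function you want to apply:

```php
<?php
// Hypothetical pattern and subject: two capture groups, three occurrences.
$pattern = '/(\w+)=(\d+)/';
$subject = 'a=1 b=2 c=3';

// PREG_PATTERN_ORDER (default): index 0 is the list of whole matches.
preg_match_all($pattern, $subject, $byPattern);
$upper = array_map('strtoupper', $byPattern[0]);

// PREG_SET_ORDER: one sub-array per occurrence.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);
$pairs = [];
foreach ($bySet as $set) {
    // $set[0] is the whole match, $set[1] and $set[2] are the groups.
    $pairs[$set[1]] = (int) $set[2];
}

print_r($upper);
print_r($pairs);
```

PREG_PATTERN_ORDER tends to be handier when only the whole matches (or a single group) are needed; PREG_SET_ORDER is usually easier when each occurrence's groups are processed together.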
$string = '/start info#example.com';
$pattern = '/{command} {name}#{domain}';
I want to extract the parameters into an array in PHP, as in the example below:
['command' => 'start', 'name' => 'info', 'domain' => 'example.com']
and
$string = '/start info#example.com';
$pattern = '/{command} {email}';
['command' => 'start', 'email' => 'info#example.com']
and
$string = '/start info#example.com';
$pattern = '{command} {email}';
['command' => '/start', 'email' => 'info#example.com']
If it's a single-line string, you can use preg_match and a regular expression such as this:
preg_match('/^\/(?P<command>\w+)\s(?P<name>[^#]+)\#(?P<domain>.+?)$/', '/start info#example.com', $match );
But depending on the variation in the data, you may have to adjust the regex a bit. This outputs
command [1-6] start
name [7-11] info
domain [12-23] example.com
but it will also have the numeric index in the array.
https://regex101.com/r/jN8gP7/1
Just to break this down a bit, in English.
The leading ^ is the start of the line, then the literal slash \/, then a named capture of \w+ (any of a-z, A-Z, 0-9, _), then a whitespace character \s, then a named capture of anything but the # sign ([^#]+), then the # sign itself, then a named capture of anything (.+?) up to the end $.
This will capture anything in this format,
(abc123_ ) space (anything but #)#(anything)
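To get exactly the array from the question, the numeric indexes can be stripped after matching. A minimal sketch using the regex above; $wanted and $params are illustrative names:

```php
<?php
$regex = '/^\/(?P<command>\w+)\s(?P<name>[^#]+)\#(?P<domain>.+?)$/';

if (preg_match($regex, '/start info#example.com', $match)) {
    // $match holds both named and numeric keys; keep only the named ones.
    $wanted = ['command', 'name', 'domain'];
    $params = array_intersect_key($match, array_flip($wanted));
    var_export($params);
}
```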
Given the following code:
$regex = '/(http\:\/\/|https\:\/\/)([a-z0-9-\.\/\?\=\+_]*)/i';
$text = preg_split($regex, $note, -1, PREG_SPLIT_DELIM_CAPTURE);
It returns an array such as:
array (size=4)
0 => string '...' (length=X)
1 => string 'https://' (length=8)
2 => string 'duckduckgo.com/?q=how+much+wood+could+a+wood-chuck+chuck+if+a+wood-chuck+could+chuck+wood' (length=89)
3 => string '...' (length=X)
I would prefer it if the returned array had size=3, with one single URL. Is this possible?
Sure, that can be done: just remove the extra capture groups from your regex. Try the following code:
$regex = '#(https?://[a-z0-9./?=+_-]*)#i';
$text = preg_split($regex, $note, -1, PREG_SPLIT_DELIM_CAPTURE);
Now the resulting array will have 3 elements instead of 4.
Besides removing the extra grouping, I have also simplified your regex, since most of the special characters don't need to be escaped inside a character class. Note that / must stay in the character class, otherwise the URL would be cut off at the first slash after the host.
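A quick way to check the element count is to run the split on a sample string. The $note text below is made up, and the sketch keeps / inside the character class so that the path and query string stay attached to the URL:

```php
<?php
// Hypothetical note text around a shortened version of the URL from the question.
$note = 'Search: https://duckduckgo.com/?q=how+much+wood and more text';

// One capture group around the whole URL; "/" is part of the character class.
$regex = '#(https?://[a-z0-9./?=+_-]*)#i';

$parts = preg_split($regex, $note, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($parts);
```

With a single capture group, PREG_SPLIT_DELIM_CAPTURE adds exactly one delimiter element per URL found, so one URL yields three parts: the text before, the URL, and the text after.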