preg_match_all returning arrays - php

I recently wrote a small script to catch any URLs that pass through a textarea on a form submit.
The regular expression I'm using is:
'/([\w]+).(local|test|stage|live).site.example.com/'
and if I submit:
<p>body</p> <p>uk2.local.site.example.net
training.test.site.example.net</p>
<p>www.google.com</p>
<p>sd2.test.site.example.net</p>
I get back an array that contains:
0 => array(3
0 => uk2.local.site.example.net
1 => training.test.site.example.net
2 => sd2.test.site.example.net
)
1 => array(3
0 => local
1 => test
2 => test
)
I'm not sure why I get the second array, and I'd like to clean it up.

Use a non-capturing group, and also escape the dots:
'/(\w+)\.(?:local|test|stage|live)\.site\.example\.com/'
// here __^^
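A minimal sketch of the idea, assuming the sample input from the question (which uses .net hosts, so the pattern below uses .net as well). preg_match_all always stores the full matches at index 0 and adds one more sub-array per capturing group, so a pattern with no capturing groups leaves only index 0. The variable names are illustrative.

```php
<?php
// Sample textarea content, taken from the question.
$text = "<p>body</p> <p>uk2.local.site.example.net\n"
      . "training.test.site.example.net</p>\n"
      . "<p>www.google.com</p>\n"
      . "<p>sd2.test.site.example.net</p>";

// No capturing groups at all: (?:...) groups the alternatives without capturing.
$pattern = '/\w+\.(?:local|test|stage|live)\.site\.example\.net/';

preg_match_all($pattern, $text, $matches);

// Index 0 holds the full matches; there are no further sub-arrays.
$urls = $matches[0];
print_r($urls);
```

If you keep a capturing group for other reasons, reading only $matches[0] gives the same list of full matches.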


Is this normal behaviour in php arrays? Array size get shortened when using numbered indexes out of order

I'm learning PHP, and while messing around with arrays to see how they work, I stumbled into this when I made two arrays.
$TestArray1 = array( 1 => 1, "string" => "string", 24, "other", 2 => 6, 8);
$TestArray2 = array( 6 => 1, "string" => "string", 24, "other", 1 => 6, 8);
But when I print them out with print_r() this is what I get (this also happens with var_dump by the way)
Array ( [1] => 1 [string] => string [2] => 6 [3] => other [4] => 8 )
Array ( [6] => 1 [string] => string [7] => 24 [8] => other [1] => 6 [9] => 8 )
As far as I can tell, by putting the two in the second array it overwrites the next possible spot with no key and then keeps going, shortening the array. So I thought that meant that if I use a 1 it would put it at the start but that does not happen either.
Is this normal or is there something wrong with my php installation?
I'm using AMPPS on Windows 10 with PHP 7.3.
Thanks in advance
Good question.
What's happening is that when determining automatic numeric indexes, PHP will look to the largest numeric index added and increment it (or use 0 if there are none).
The key is optional. If it is not specified, PHP will use the increment of the largest previously used integer key.
What's happening with your first array is that as it is evaluated left-to-right, 24 is inserted at index 2 because the last numeric index was 1 => 1.
Then when it gets to 2 => 6, it overwrites the previous value at index 2. This is why 24 is missing from your first array.
If multiple elements in the array declaration use the same key, only the last one will be used as all others are overwritten.
Here's a breakdown (leaving out the "string" key):
$TestArray1 = [1 => 1]; // Array( [1] => 1 )
// no index, so use last numeric + 1
$TestArray1[] = 24; // Array( [1] => 1, [2] => 24 )
$TestArray1[2] = 6; // Array( [1] => 1, [2] => 6 )
When you manually add numeric indexes that are lower than previous ones (i.e. $TestArray2), they are added with the key you provided, but they keep their insertion position, so they show up later in the output.
This is because PHP arrays are really maps that just pretend to be indexed arrays sometimes, depending on what's in them.
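The behaviour described above can be reproduced step by step. This is only a sketch using the values from the question; the variable names $a and $b are illustrative.

```php
<?php
// Build the first array one element at a time to see where each value lands.
$a = [];
$a[1] = 1;
$a["string"] = "string";
$a[] = 24;       // largest integer key so far is 1, so this gets key 2
$a[] = "other";  // key 3
$a[2] = 6;       // key 2 already exists: 24 is overwritten, position unchanged
$a[] = 8;        // key 4
print_r($a);

// The second array from the question: key 1 is added after keys 6, 7 and 8.
$b = [6 => 1, "string" => "string", 24, "other", 1 => 6, 8];

// Keys are listed in insertion order, not numeric order.
$keys = array_keys($b);
print_r($keys);
```

The key list for $b shows 1 sitting between 8 and 9, which is the insertion order rather than a sorted order.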
References are from the PHP manual page for Arrays

PHP Regex: How to get optional text if present?

Let's take the following string as an example:
$string = "length:max(260):min(20)";
In the above string, :max(260):min(20) is optional. I want to get it if it is present; otherwise only length should be returned.
I have the following regex, but it doesn't work:
/(.*?)(?::(.*?))?/se
It doesn't return anything in the array when I use the preg_match function.
Remember, the string can be something other than the one above. Maybe like this:
$string = "number:disallow(negative)";
Is there a problem in my regex, or will PHP simply not return anything? Dumping preg_match returns int 1, which means the string matches the regex.
Fully Dumped:
int 1
array (size=2)
0 => string '' (length=0)
1 => string '' (length=0)
The lazy .*? at the very beginning is allowed to match zero characters, and everything after it is optional, so the match succeeds at position zero with an empty string. If you change your preg_match call to preg_match_all you'll see the captured groups.
There are other problems with your regular expression. The e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, and it only ever applied to preg_replace.
You don't need the s modifier either.
This works for your case:
/([^:]+)(:.*)?/
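A short sketch of how that pattern behaves with preg_match on the strings from the question. The helper name split_rule is hypothetical. Note that preg_match leaves trailing groups that did not take part in the match out of the result array, so the optional part needs a default.

```php
<?php
// Hypothetical helper: returns [name, optionalTail] for a rule string.
function split_rule(string $rule): array
{
    if (!preg_match('/([^:]+)(:.*)?/', $rule, $m)) {
        return ['', ''];
    }
    // $m[2] is absent when the optional group did not match.
    return [$m[1], $m[2] ?? ''];
}

print_r(split_rule("length:max(260):min(20)"));
print_r(split_rule("length"));
print_r(split_rule("number:disallow(negative)"));
```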
I tried to prepare a regex that should solve your issue and also add some value.
This regex not only matches the optional elements but also captures them as key/value pairs.
Regex
/(?<=:|)(?'prop'\w+)(?:\((?'val'.+?)\))?/g
Test string
length:max(260):min(20)
length
number:disallow(negative)
Result
MATCH 1
prop [0-6] length
MATCH 2
prop [7-10] max
val [11-14] 260
MATCH 3
prop [16-19] min
val [20-22] 20
MATCH 4
prop [24-30] length
MATCH 5
prop [31-37] number
MATCH 6
prop [38-46] disallow
val [47-55] negative
EDIT
I think I understand what you meant by duplicate arrays with different keys: they were due to the named captures, e.g. prop and val.
Here is the revision without named captures:
Regex
/(?<=:|)(\w+)(?:\((.+?)\))?/
Sample code
$str = "length:max(260):min(20)";
$str .= "\nlength";
$str .= "\nnumber:disallow(negative)";
preg_match_all('/(?<=:|)(\w+)(?:\((.+?)\))?/', $str, $matches);
print_r($matches);
Result
Array
(
[0] => Array
(
[0] => length
[1] => max(260)
[2] => min(20)
[3] => length
[4] => number
[5] => disallow(negative)
)
[1] => Array
(
[0] => length
[1] => max
[2] => min
[3] => length
[4] => number
[5] => disallow
)
[2] => Array
(
[0] =>
[1] => 260
[2] => 20
[3] =>
[4] =>
[5] => negative
)
)
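If you would rather have each property next to its value, one option is the PREG_SET_ORDER flag, which groups the result per match instead of per capture group. This is a sketch using the same pattern; the $pairs structure is just an illustration, not part of the original answer.

```php
<?php
$str = "length:max(260):min(20)";

// PREG_SET_ORDER: $sets[0] is the first match with all its groups, and so on.
preg_match_all('/(?<=:|)(\w+)(?:\((.+?)\))?/', $str, $sets, PREG_SET_ORDER);

$pairs = [];
foreach ($sets as $set) {
    // Group 2 is missing from the set when there was no "(...)" part.
    $pairs[] = [$set[1], $set[2] ?? null];
}

print_r($pairs);
```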

Searching through multiple fields in SphinxSearch (PHP)

I have the following code:
$this->api = new App_Other_SphinxSearch();
$this->api->SetServer($host, $port);
$this->api->SetConnectTimeout(1);
$this->api->SetArrayResult(true);
$results = $this->api->Query("@(title,content) test", 'members');
echo "<pre>";print_r($results);die;
According to their documentation, a syntax like @(field_1,field_2) query should return docs which match the string query in either field_1 or field_2.
The PHP SDK returns something entirely different:
Array
(
[error] =>
[warning] =>
[status] => 0
[fields] => Array
(
[0] => title
[1] => content
)
[attrs] => Array
(
[created] => 2
[content] => 7
)
[total] => 0
[total_found] => 0
[time] => 0.000
[words] => Array
(
[title] => Array
(
[docs] => 10
[hits] => 34
)
[content] => Array
(
[docs] => 34
[hits] => 139
)
[test] => Array
(
[docs] => 26
[hits] => 34
)
)
)
There is no matches key in the array, even though it reports some hits. I don't really understand why this is happening, especially because if I try the same query from the command line, everything works correctly.
Any help?
Edit: Querying like this works though: @* test. Not really what I want, because that searches in all fields.
[total_found] => 0 says there were no matches.
The words array just tells you how many documents contain each word and how many times it appears in ANY (and all) fields, WITHOUT regard to your specific query.
Sphinx caches those word stats, which helps it make quick sanity checks on queries. When the query runs it can end up not matching any documents (because only then does it apply the field-level filters), even though the individual words are found.
That explains your misinterpretation of the results, but not why you are getting the results you get.
You are entering an Extended Mode query (as evidenced by the link to the Sphinx documentation), but the Sphinx API defaults to ALL query mode. Notice also that title and content are in the words array, so they are being taken as plain keywords, not as syntax.
So you need to add this call:
$this->api->SetMatchMode(SPH_MATCH_EXTENDED);
By the way, @* test works because the @* is simply ignored in ALL query mode.

Regex skipping value

Greetings All
I am trying to get the values in the 4th column from the left for this url. I can get all the values, but it skips the first one (e.g. 30, which I think is the value at the top right now).
My regex is
~<td align="center" class="row2">.*([\d,]+).*</td>~isU
NOTE: HTML parsing is not an option right now, as this is part of a huge system and cannot be changed.
Thanking you
Imran
You could just use:
/([\d,]+)/
As the javascript function can be exploited as a "regex selection point"
If you want your regex to work, you need to use a non-greedy expression, i.e. change .* to .*?
Also, in the HTML the align attribute of the first matching cell is wrapped in single quotes ('') rather than double quotes (""), for some weird inconsistent reason, so your pattern never matches that cell. Try this:
|<td align=["\']center["\'] class="row2">.*?([\d,]+).*?</td>|is
Edit:
$a = file_get_contents('http://www.zajilnet.com/forum/index.php?showforum=31');
preg_match_all('|<td align=["\']center["\'] class="row2">.*?([\d,]+).*?</td>|is',$a,$m);
print_r($m[1]);
Result:
Array
(
[0] => 30
[1] => 16
[2] => 56
[3] => 14
[4] => 96
[5] => 4
[6] => 0
[7] => 17
[.... and more....]
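To see why the quote style matters, here is a self-contained sketch. The markup in $html is a made-up sample (the real page is not reproduced here); it only imitates a first cell with single-quoted align followed by cells with double-quoted align.

```php
<?php
// Hypothetical sample markup, not the real forum page.
$html = <<<'HTML'
<td align='center' class="row2">30</td>
<td align="center" class="row2">16</td>
<td align="center" class="row2"><b>1,256</b></td>
HTML;

// Pattern that insists on double quotes around center: the first cell is skipped.
preg_match_all('~<td align="center" class="row2">.*?([\d,]+).*?</td>~is', $html, $strict);

// Pattern from the answer: accepts either quote style.
preg_match_all('|<td align=["\']center["\'] class="row2">.*?([\d,]+).*?</td>|is', $html, $tolerant);

print_r($strict[1]);
print_r($tolerant[1]);
```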

Get the real difference between two arrays in php

I'm trying to get the difference between two arrays, but with array_diff, array_diff_assoc, or array_diff_key I can't get what I want..
Array 1 :
0 => 424012,
1 => 423000,
2 => 425010,
3 => 431447,
4 => 421001,
5 => 421002,
Array 2 :
0 => 424012,
1 => 423000,
2 => 425010,
3 => 431447,
4 => 431447,
5 => 421001,
6 => 421002,
array_diff = array ()
// empty
array_diff_assoc = array (
4 => 431447,
5 => 421001,
6 => 421002,
)
// OK but too much :)
array_diff_key = array(
6 => 421002
)
// nope i don't want that :(
I want 431447, because it appears only once in the first array and twice in the second.
Regards, Tony
Is that exactly what you want? Only those that occur one time in the first, and two times in the second?
You can basically write your own function for that. Search through the second array, get a list of values that occur two times (or more than once, depending on what it is that you actually want), and then search for those in the first one (this you can do using a built-in PHP function array_intersect).
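One possible way to write such a function is sketched below. It takes a slightly different route from the array_intersect idea above: it counts occurrences with array_count_values and returns every value that appears more often in the second array than in the first. The function name multiset_diff is hypothetical, and the sketch assumes the values are integers or strings, which is what array_count_values supports.

```php
<?php
// Hypothetical helper: values that occur more times in $b than in $a,
// repeated once per surplus occurrence.
function multiset_diff(array $a, array $b): array
{
    // How many times each value occurs in $b.
    $counts = array_count_values($b);

    // Use up one occurrence for every matching value found in $a.
    foreach ($a as $value) {
        if (isset($counts[$value]) && $counts[$value] > 0) {
            $counts[$value]--;
        }
    }

    // Whatever is left over is the surplus in $b.
    $extra = [];
    foreach ($counts as $value => $remaining) {
        for ($i = 0; $i < $remaining; $i++) {
            $extra[] = $value;
        }
    }
    return $extra;
}

$array1 = [424012, 423000, 425010, 431447, 421001, 421002];
$array2 = [424012, 423000, 425010, 431447, 431447, 421001, 421002];

$result = multiset_diff($array1, $array2);
print_r($result);
```

With the arrays from the question, the only surplus value in the second array is 431447.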
