Splitting string into sections while maintaining all non-word characters - php

I'm working on an encryption function just for fun (for a non-production environment). Currently running my encrypt function like this:
encrypt("This is a string.");
Produces the following string:
GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.
This is exactly what I wanted and expected. Now I'm trying to write a decrypt function. Every encrypted character becomes a single capital letter followed by three lowercase letters (as you can see from the example above).
My plan was to run preg_split() to get the different letters of the string.
Here is my current PHP code (pattern ([A-Z][a-z]{3})):
print_r(preg_split("/([A-Z][a-z]{3})/", $string));
There are a couple of problems with this. While testing, I discovered that it does not return what I expected. The return is:
Array
(
[0] =>
[1] =>
[2] =>
[3] =>
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] =>
[10] =>
[11] =>
[12] =>
[13] => .
)
(Via eval.in)
So this has the right number of elements, but they are all blank. Why are all the values blank?
I also realized that I need to include other characters, such as spaces, commas and periods, in the preg_split() return. In the return I got from eval.in, it appears as though the final period has been included. Is this true for spaces and other characters as well, or do I need to do something special for these characters?

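preg_split() splits on those matches, so the matched text is treated as a delimiter and removed; what is left is the text between the matches, which is mostly empty. You want preg_match_all, or preg_split with PREG_SPLIT_DELIM_CAPTURE and PREG_SPLIT_NO_EMPTY:
print_r(preg_split("/([A-Z][a-z]{3})/",
$string,
-1,
PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY));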
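To make the difference concrete, here is a minimal sketch that runs both calls on the example string from the question. The variable names $plain and $tokens are made up for illustration. It only tokenizes the string; mapping each four-letter group back to a character depends on the asker's encrypt function, which is not shown in the question.

```php
<?php
$string = 'GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.';

// Plain split: the four-letter groups are the delimiters, so only the text
// between them (empty strings, spaces, the final period) is returned.
$plain = preg_split('/[A-Z][a-z]{3}/', $string);

// With a capturing group and both flags, the delimiters are kept as
// elements and the empty strings between adjacent groups are dropped.
$tokens = preg_split(
    '/([A-Z][a-z]{3})/',
    $string,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

print_r($tokens);
```

Because nothing is discarded, implode('', $tokens) gives back the original string, so spaces and punctuation keep their positions for the decrypt step.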

You should remove the capturing group () and use preg_match_all.
$text = "GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.";
preg_match_all("/[A-Z][a-z]{3}|(?: |,|\.)/", $text, $match);
print_r($match);
Output:
Array
(
[0] => Array
(
[0] => Gnul
[1] => Hynk
[2] => Afds
[3] => Gknp
[4] =>
[5] => Afds
[6] => Gknp
[7] =>
[8] => Wgbf
[9] =>
[10] => Gknp
[11] => Lnug
[12] => Buip
[13] => Afds
[14] => Cbhg
[15] => Byfg
[16] => .
)
)

Related

How to split 1 array into 2 arrays, remove certain items, and combine them again into 1 array in PHP?

I want to build something using arrays. I have one array that I need to split into two arrays. After that, I want to search both arrays for specific items, remove them, and then combine the two arrays into one.
How do I do that?
I have already tried unset on the array, but I am confused about how to use it for a specific key, since my data is formatted like 16/2/1/1 and 16/2/1/5. I need to remove the entries that have 1 in the third position.
My array looks like this:
Array
(
[1] => Array
(
[0] => 16/2/1/1 --> remove this have 1 after 2
[1] => 16/2/0/2
[2] => 16/2/0/3
[3] => 16/2/0/4
[4] => 16/2/0/5
[5] => 16/2/0/6
[6] => 16/2/0/7
[7] => 16/2/0/8
[8] => 16/2/0/9
[9] => 16/2/0/10
[10] => 16/2/0/11
[11] => 16/2/0/12
[12] => 16/2/0/13
[13] => 16/2/0/14
[14] => 16/2/0/15
[15] => 16/2/0/16
)
[2] => Array
(
[0] => 16/2/0/1
[1] => 16/2/0/2
[2] => 16/2/0/3
[3] => 16/2/0/4
[4] => 16/2/1/5 --> and this have 1 after 2 before 5
[5] => 16/2/0/6
[6] => 16/2/0/7
[7] => 16/2/0/8
[8] => 16/2/0/9
[9] => 16/2/0/10
[10] => 16/2/0/11
[11] => 16/2/0/12
[12] => 16/2/0/13
[13] => 16/2/0/14
[14] => 16/2/0/15
[15] => 16/2/0/16
)
)
I expect the output to look something like this (after combining):
Array
(
[0] => 16/2/0/2
[1] => 16/2/0/3
[2] => 16/2/0/4
[3] => 16/2/0/6
[4] => 16/2/0/7
[5] => 16/2/0/8
[6] => 16/2/0/9
[7] => 16/2/0/10
[8] => 16/2/0/11
[9] => 16/2/0/12
[10] => 16/2/0/13
[11] => 16/2/0/14
[12] => 16/2/0/15
[13] => 16/2/0/16
)
Thanks for taking the time to help me.
Make the array unique and then extract items that are digits/digits/NOT 1/digits:
$array = preg_grep('#^\d+/\d+/[^1]/\d+#', array_unique($array));
I would use preg_grep, which lets you search an array using a regular expression.
$array =[
'16/2/0/13',
'16/2/0/16',
'16/2/1/5'
];
$array = preg_grep('~^16/2/0/\d+$~', $array);
print_r($array);
Output
Array
(
[0] => 16/2/0/13
[1] => 16/2/0/16
)
The Regex
^ match start of string
16/2/0/ - match literally (at the start of string, see above)
\d+ any digit one or more
$ match end of string
Regular expressions are a way to do pattern matching. In this case the pattern is 16/2/0/{n}, where {n} is any number, so we keep only the items that match that pattern.
Then, if you have duplicates, you can remove them easily with array_unique().
There are many other ways to do this, such as array_filter with a custom callback, but this is the most straightforward one (if you know regex).
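Both answers filter a flat list, while the array in the question is nested. Below is a minimal sketch of the full flow under that assumption: flatten, filter with preg_grep, de-duplicate, reindex. The name $groups and the shortened data are hypothetical. Note that this filters string by string, so it keeps 16/2/0/1 and the full data would keep 16/2/0/5; the asker's expected output omits both, and this sketch does not attempt that.

```php
<?php
// Hypothetical input: two sub-arrays shaped like the question's data (shortened).
$groups = [
    1 => ['16/2/1/1', '16/2/0/2', '16/2/0/3'],
    2 => ['16/2/0/1', '16/2/0/2', '16/2/1/5'],
];

// Flatten the sub-arrays into one list.
$merged = array_merge(...array_values($groups));

// Keep only entries whose third field is exactly 0.
$filtered = preg_grep('~^\d+/\d+/0/\d+$~', $merged);

// Drop duplicates and reindex from 0.
$kept = array_values(array_unique($filtered));

print_r($kept);
```

Anchoring the pattern with ^ and $ and matching the third field as a literal 0 avoids accepting values such as 16/2/x/3, which a [^1] character class would let through.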

preg_match to show lines containing one string and one of the other two

I've got an array in php:
Array
(
[0] => sth!Man!Tree!null
[1] => sth!Maning!AppTree!null
[2] => sth!Man!Lake!null
[3] => sth!Man!Tree!null
[4] => sth!Man!AppTree!null
[5] => sth!Maning!AppTree!null
[6] => sth!Man!Tree!null
[7] => sth!Maning!AppTree!null
[8] => sth!Maning!AppTree!null
[9] => sth!Man!Tree!null
[10] => sth!Man!Tree!null
[11] => sth!Man!Tree!null
[12] => sth!Man!Tree!null
[12] => sth!Man!Lake!null
[13] => sth!Maning!Tree!null
)
and this preg_match function:
preg_match("/Man/i", $line) && (preg_match("/!Tree!/i", $line) || preg_match("/!Lake!/i", $line))
My goal is to change it to one preg_match regex function to display only lines with Man and Tree or Man and Lake. Is it possible?
You can use the following regex:
(?i)\b(?:Lake|Tree)\b.*\bMan\b|\bMan\b.*\b(?:Tree|Lake)\b
The word boundaries match only the whole words, (?i) inline mode option enables case-insensitive search, and we need at least two main alternatives to account for different positions of Man and Lake/Tree.
Sample code:
$re = "/(?i)\\b(?:Lake|Tree)\\b.*\\bMan\\b|\\bMan\\b.*\\b(?:Tree|Lake)\\b/";
$str = " Man and Tree or Man and Lake. Is it possible?";
preg_match($re, $str, $matches);
preg_match("/Man!(?:Tree|Lake)/i", $line, $matches) should do it most efficiently.
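As a rough illustration of the single-pattern approach, the sketch below applies it to a few lines from the question with preg_grep. It assumes the fields are separated by ! and that the Tree/Lake field directly follows the Man field, which holds for the sample data; the leading and trailing ! in the pattern are an added tightening, and $lines and $hits are made-up names. Unlike the asker's original condition, it does not accept Maning.

```php
<?php
// A few lines taken from the question's array.
$lines = [
    'sth!Man!Tree!null',
    'sth!Maning!AppTree!null',
    'sth!Man!Lake!null',
    'sth!Man!AppTree!null',
    'sth!Maning!Tree!null',
];

// One pattern: the field "Man" immediately followed by the field "Tree" or "Lake".
$hits = array_values(preg_grep('/!Man!(?:Tree|Lake)!/i', $lines));

print_r($hits);
```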

preg_match_all for this pattern

I have a pattern that I want to use to extract the numbers after the /image/ field. I tested it online at http://www.functions-online.com/preg_match_all.html; it gives the desired output for the first link, but not for the other links.
Here is my pattern:
/\sample.com\/image\/(.*)\//
And here is my string:
Mario Ermito photos by sample.com Mario Ermito Latest News, Photos, Biography, Videos and Wallpapers [img]http://xyz.sample.com/image/4205476/600full-mario-ermito.jpg[/img][img]http://xyz.sample.com/image/4453948/600full-my-profile.jpg[/img][img]http://xyz.sample.com/image/427185/600full-eagle-eye-poster.jpg[/img][img]http://xyz.sample.com/image/1323868/600full-alexis-bledel.jpg[/img][img]http://xyz.sample.com/image/2505314/600full-monroe-lee.jpg[/img][img]http://xyz.sample.com/image/3300481/600full-cindy-crawford.jpg[/img][img]http://xyz.sample.com/image/1046646/600full-pitura-freska.jpg[/img][img]http://xyz.sample.com/image/4322305/600full-kristin-kreuk.jpg[/img][img]http://xyz.sample.com/image/4261476/600full-kang-so--ra.jpg[/img][img]http://xyz.sample.com/image/3386911/600full-summer-brielle.jpg[/img][img]http://xyz.sample.com/image/4663949/600full-the-closer-artwork.jpg[/img]
I want to extract only the number after the /image/ field, not the image name. For example, my desired output is
4205476
4453948
427185
and so on for all the numbers in the string.
Use this regular expression: ~/image/(.*?)/~
<?php
$str=' Mario Ermito photos by sample.com Mario Ermito Latest News, Photos, Biography, Videos and Wallpapers [img]http://xyz.sample.com/image/4205476/600full-mario-ermito.jpg[/img][img]http://xyz.sample.com/image/4453948/600full-my-profile.jpg[/img][img]http://xyz.sample.com/image/427185/600full-eagle-eye-poster.jpg[/img][img]http://xyz.sample.com/image/1323868/600full-alexis-bledel.jpg[/img][img]http://xyz.sample.com/image/2505314/600full-monroe-lee.jpg[/img][img]http://xyz.sample.com/image/3300481/600full-cindy-crawford.jpg[/img][img]http://xyz.sample.com/image/1046646/600full-pitura-freska.jpg[/img][img]http://xyz.sample.com/image/4322305/600full-kristin-kreuk.jpg[/img][img]http://xyz.sample.com/image/4261476/600full-kang-so--ra.jpg[/img][img]http://xyz.sample.com/image/3386911/600full-summer-brielle.jpg[/img][img]http://xyz.sample.com/image/4663949/600full-the-closer-artwork.jpg[/img]';
preg_match_all('~/image/(.*?)/~', $str, $matches);
print_r($matches[1]);
OUTPUT :
Array
(
[0] => 4205476
[1] => 4453948
[2] => 427185
[3] => 1323868
[4] => 2505314
[5] => 3300481
[6] => 1046646
[7] => 4322305
[8] => 4261476
[9] => 3386911
[10] => 4663949
)
You need to adjust your regular expression:
$regex = '#sample\.com/image/([0-9]+)/#';
preg_match_all($regex, $str, $m);
print_r($m);
Expected output:
Array
(
[0] => Array
(
[0] => sample.com/image/4205476/
[1] => sample.com/image/4453948/
[2] => sample.com/image/427185/
[3] => sample.com/image/1323868/
[4] => sample.com/image/2505314/
[5] => sample.com/image/3300481/
[6] => sample.com/image/1046646/
[7] => sample.com/image/4322305/
[8] => sample.com/image/4261476/
[9] => sample.com/image/3386911/
[10] => sample.com/image/4663949/
)
[1] => Array
(
[0] => 4205476
[1] => 4453948
[2] => 427185
[3] => 1323868
[4] => 2505314
[5] => 3300481
[6] => 1046646
[7] => 4322305
[8] => 4261476
[9] => 3386911
[10] => 4663949
)
)
Keep in mind that PHP returns everything the pattern matched, including the parts you don't need; the numbers alone are in $m[1].
From the PHP Manual (http://www.php.net/manual/en/function.preg-match-all.php):
Orders results so that $matches[0] is an array of full pattern
matches, $matches[1] is an array of strings matched by the first
parenthesized subpattern, and so on.
Try this:
/sample\.com\/image\/(\d+)\//

preg_split with regex giving incorrect output

I'm using preg_split on a string, but I'm not getting the desired output. For example,
$string = 'Tachycardia limit_from:1900-01-01 limit_to:2027-08-29 numresults:10 sort:publication-date direction:descending facet-on-toc-section-id:Case Reports';
$vals = preg_split("/(\w*\d?):/", $string, -1, PREG_SPLIT_DELIM_CAPTURE);
generates this output:
Array
(
[0] => Tachycardia
[1] => limit_from
[2] => 1900-01-01
[3] => limit_to
[4] => 2027-08-29
[5] => numresults
[6] => 10
[7] => sort
[8] => publication-date
[9] => direction
[10] => descending facet-on-toc-section-
[11] => id
[12] => Case Reports
)
This is wrong. The desired output is
Array
(
[0] => Tachycardia
[1] => limit_from
[2] => 1900-01-01
[3] => limit_to
[4] => 2027-08-29
[5] => numresults
[6] => 10
[7] => sort
[8] => publication-date
[9] => direction
[10] => descending
[11] => facet-on-toc-section-id
[12] => Case Reports
)
There is something wrong with the regex, but I'm not able to fix it.
I would use
$vals = preg_split("/(\S+):/", $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The output is what you want, apart from trailing spaces on some values, which trim() removes.
It's because the \w class does not include the - character, so I would extend the pattern to allow it as well:
/((?:\w|-)*\d?):/
Try this regex instead to include '-' or other characters in your splitting pattern: http://regexr.com?32qgs
((?:[\w\-])*\d?):
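A minimal sketch of the hyphen-aware split, using the string from the question. The character class [\w-] and the trim step are one possible way to write it, not necessarily what the answerers had in mind; $parts is a made-up name.

```php
<?php
$string = 'Tachycardia limit_from:1900-01-01 limit_to:2027-08-29 numresults:10 '
    . 'sort:publication-date direction:descending facet-on-toc-section-id:Case Reports';

// Split on "key:" where a key may contain word characters and hyphens.
// The capturing group keeps each key in the result.
$parts = preg_split('/([\w-]+):/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

// Values are followed by a space before the next key, so trim each element.
$parts = array_map('trim', $parts);

print_r($parts);
```

With this input the result has 13 elements, and facet-on-toc-section-id stays in one piece.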

preg_match to match an optional string, but not match all of the string

Take for example the following regex match.
preg_match('!^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-(january|february|march|april|may|june|july|august|september|october|november|december):([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)(/page-[0-9]+)?$!', 'publisher/news/1/2010-march:03-23/test_title/1/page-1', $matches);
print_r($matches);
It produces the following:
Array
(
[0] => publisher/news/1/2010-march:03-23/test_title/1/page-1
[1] => news
[2] => 1
[3] => 2010
[4] => march
[5] => 03
[6] => 23
[7] => test_title
[8] => 1
[9] => /page-1
)
However as the last match is optional it can also work with matching the following "publisher/news/1/2010-march:03-23/test_title/1". My problem is that I want to be able to match (/page-[0-9]+) if it exists, but match only the page number so "publisher/news/1/2010-march:03-23/test_title/1/page-1" would match like so:
Array
(
[0] => publisher/news/1/2010-march:03-23/test_title/1/page-1
[1] => news
[2] => 1
[3] => 2010
[4] => march
[5] => 03
[6] => 23
[7] => test_title
[8] => 1
[9] => 1
)
I've tried the following regex
'!^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-(january|february|march|april|may|june|july|august|september|october|november|december):([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)/?p?a?g?e?-?([0-9]+)?$!'
This works; however, it will also match "publisher/news/1/2010-march:03-23/test_title/1/1". How can I require part of the pattern to match without having it come back in the matches? Is it possible in a single regex?
To reject publisher/news/1/2010-march:03-23/test_title/1/whatever outright:
!^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-(january|february|march|april|may|june|july|august|september|october|november|december):([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)(?:/page-([0-9]+))?$!
To still match publisher/news/1/2010-march:03-23/test_title/1/whatever but ignore the /whatever:
!^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-(january|february|march|april|may|june|july|august|september|october|november|december):([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)(?:(?:/page-([0-9]+))|/.*)?$!
Maybe like this (the page number then ends up in $matches[10]):
'!^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-(january|february|march|april|may|june|july|august|september|october|november|december):([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)(/page-([0-9]+))?$!'
This is the regex you are looking for:
^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-(january|february|march|april|may|june|july|august|september|october|november|december):([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)/(?:page-(\d+))?
You can test it in RegexBuddy. If "page-1" is not present, group 9 is left empty; otherwise it is set.
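The non-capturing group (?:...) is what lets /page- be required without appearing in the result. Below is a small sketch using the first pattern from the answers above, run against the three URLs discussed in the question; the variable names are made up for illustration. When the optional group does not take part in the match, PHP omits the trailing entry from $matches, so the sketch reads index 9 with the ?? operator.

```php
<?php
$pattern = '!^publisher/([A-Za-z0-9\-\_]+)/([0-9]+)/([0-9]{4})-'
    . '(january|february|march|april|may|june|july|august|september|october|november|december)'
    . ':([0-9]{1,2})-([0-9]{1,2})/([A-Za-z0-9\-\_]+)/([0-9]+)(?:/page-([0-9]+))?$!';

// URL with a page suffix: group 9 holds only the number.
preg_match($pattern, 'publisher/news/1/2010-march:03-23/test_title/1/page-1', $withPage);
$page = $withPage[9] ?? null;

// URL without a page suffix: still matches, but index 9 is absent.
preg_match($pattern, 'publisher/news/1/2010-march:03-23/test_title/1', $withoutPage);
$noPage = $withoutPage[9] ?? null;

// A bare "/1" suffix is rejected because "/page-" is mandatory inside the group.
$rejected = preg_match($pattern, 'publisher/news/1/2010-march:03-23/test_title/1/1');

var_dump($page, $noPage, $rejected);
```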
