How to parse a file in PHP - php

I have a string with the following content
[text1] some content [text2] some
content some content [text3] some
content
The "[textn]" are finite and also have specific names. I want to get the content into an array. Any idea?

If you don't wanna use regular expressions, then strtok() would do the trick here:
strtok($txt, "["); // search for first [
while ($id = strtok("]")) { // alternate ] and [
$result[$id] = strtok("["); // add token
}

In php there are function for splitting the string with regexp delimiters, like preg_match, preg_match_all, look them up.
If you have a word list, you can split the string like this (obviously, one could write it much nicer):
$words = array('[text1]','[text2]','[text3]');
$str = "[text1] some content [text2] some content some content [text3] some content3";
for ($i=0; $i<sizeof($words) ; $i++) {
$olddel = $del;
$del = $words[$i];
list($match,$str) = explode($del,$str);
if ($i-1 >= 0) { $matches[$i-1] = $olddel.' '.$match; }
}
$matches[] =$del." ".$str;
print_r($matches);
This will output: Array ( [0] => [text1] some content [1] => [text2] some content some content [2] => [text3] some content3 )

preg_match or preg_match_all, you need to give us an example if you want regex.
$string = "[text1] some content [text2] some content some content [text3] some content";
preg_match_all("#\[([^\[\]]+)\]#is", $string, $matches);
print_r($matches); //Array ( [0] => Array ( [0] => [text1] [1] => [text2] [2] => [text3] ) [1] => Array ( [0] => text1 [1] => text2 [2] => text3 ) )
Non-recursive.

Is [ and ] part of the string or did you just use them to highlight the part that you want to extract? If it is not, then you can use
if (preg_match_all("/\b(text1|text2|text3|foo|bar)\b/i", $string, $matches)) {
print_r($matches);
}

Related

PHP explode String with Data Pairs

I have a dataset and I want to conevert it into an array and I just can't figure out how...
I've tried a couple things like preg_replace() with regex and explode() but it doesn't come out the way I need it.
So my dataset looks like this:
dataCrossID=12345, DeviceID=[ID=1234567]
dataCrossID=5678, DeviceID=[ID=7654321]
dataCrossID=67899, DeviceID=[ID=87654321]
and the Array should look like this:
$dataSet(
[12345] => 1234567,
[5678] => 7654321,
[67899] => 87654321,
)
I tried regex but the fact that the numbers got different lenghts makes it hard for me.
Does anyone have an idea?
The easiest way would be using preg_match_all with an simple regular expression.
$data = 'dataCrossID=12345, DeviceID=[ID=1234567]
dataCrossID=5678, DeviceID=[ID=7654321]
dataCrossID=67899, DeviceID=[ID=87654321]';
preg_match_all('/=([0-9]+).*=([0-9]+)/', $data, $matches, PREG_SET_ORDER);
$dataSet = [];
foreach ($matches as $match) {
$dataSet[$match[1]] = $match[2];
}
print_r($dataSet);
Use preg_match_all() to identify the pieces of text you need:
$input = <<< E
dataCrossID=12345, DeviceID=[ID=1234567]
dataCrossID=5678, DeviceID=[ID=7654321]
dataCrossID=67899, DeviceID=[ID=87654321]
E;
preg_match_all('/dataCrossID=(\d+), DeviceID=\[ID=(\d+)\]/', $input, $matches, PREG_SET_ORDER);
print_r($matches);
The content of $matches is:
Array
(
[0] => Array
(
[0] => dataCrossID=12345, DeviceID=[ID=1234567]
[1] => 12345
[2] => 1234567
)
[1] => Array
(
[0] => dataCrossID=5678, DeviceID=[ID=7654321]
[1] => 5678
[2] => 7654321
)
[2] => Array
(
[0] => dataCrossID=67899, DeviceID=[ID=87654321]
[1] => 67899
[2] => 87654321
)
)
You can now iterate over $matches and use the values at positions 1 and 2 as keys and values to extract the data into the desired array:
$output = array_reduce(
$matches,
function(array $c, array $m) {
$c[$m[1]] = $m[2];
return $c;
},
array()
);
print_r($output);
The output is:
Array
(
[12345] => 1234567
[5678] => 7654321
[67899] => 87654321
)

Strange behavior of preg_match_all php

I have a very long string of html. From this string I want to parse pairs of rus and eng names of cities. Example of this string is:
$html = '
Абакан
Хакасия республика
Абан
Красноярский край
Абатский
Тюменская область
';
My code is:
$subject = $this->html;
$pattern = '/<a href="([\/a-zA-Z0-9-"]*)">([а-яА-Я]*)/';
preg_match_all($pattern, $subject, $matches);
For trying I use regexer . You can see it here http://regexr.com/399co
On the test used global modifier - /g
Because of in PHP we can't use /g modifier I use preg_match_all function. But result of preg_match_all is very strange:
Array
(
[0] => Array
(
[0] => <a href="/forecasts5000/russia/republic-khakassia/abakan">Абакан
[1] => <a href="/forecasts5000/russia/krasnoyarsk-territory/aban">Абан
[2] => <a href="/forecasts5000/russia/tyumen-area/abatskij">Аба�
[3] => <a href="/forecasts5000/russia/arkhangelsk-area/abramovskij-ma">Аб�
)
[1] => Array
(
[0] => /forecasts5000/russia/republic-khakassia/abakan
[1] => /forecasts5000/russia/krasnoyarsk-territory/aban
[2] => /forecasts5000/russia/tyumen-area/abatskij
[3] => /forecasts5000/russia/arkhangelsk-area/abramovskij-ma
)
[2] => Array
(
[0] => Абакан
[1] => Абан
[2] => Аба�
[3] => Аб�
)
)
First of all - it found only first match (but I need to get array with all matches)
The second - result is very strange for me. I want to get the next result:
pairs of /forecasts5000/russia/republic-khakassia/abakan and Абакан
What do I do wrong?
Element 0 of the result is an array of each of the full matches of the regexp. Element 1 is an array of all the matches for capture group 1, element 2 contains capture group 2, and so on.
You can invert this by using the PREG_SET_ORDER flag. Then element 0 will contain all the results from the first match, element 1 will contain all the results from the second match, and so on. Within each of these, [0] will be the full match, and the remaining elements will be the capture groups.
If you use this option, you can then get the information you want with:
foreach ($matches as $match) {
$url = $match[1];
$text = $match[2];
// Do something with $url and $text
}
You can also use T-Regx library which has separate methods for each case :)
pattern('<a href="([/a-zA-Z0-9-"]*)">([а-яА-Я]*)')
->match($this->html)
->forEach(function (Match $match) {
$match = $match->text();
$group = $match->group(1);
echo "Match $match with group $group"
});
I also has automatic delimiters

How to get a particular string using preg_replace?

i want to get a particular value from string in php. Following is the string
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_replace('/(.*)\[(.*)\](.*)\[(.*)\](.*)/', '$2', $str);
i want to get value of data01. i mean [1,2].
How can i achieve this using preg_replace?
How can solve this ?
preg_replace() is the wrong tool, I have used preg_match_all() in case you need that other item later and trimmed down your regex to capture the part of the string you are looking for.
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_match_all('/\[([0-9,]+)\]/',$string,$match);
print_r($match);
/*
print_r($match) output:
Array
(
[0] => Array
(
[0] => [1,2]
[1] => [2,3]
)
[1] => Array
(
[0] => 1,2
[1] => 2,3
)
)
*/
echo "Your match: " . $match[1][0];
?>
This enables you to have the captured characters or the matched pattern , so you can have [1,2] or just 1,2
preg_replace is used to replace by regular expression!
I think you want to use preg_match_all() to get each data attribute from the string.
The regex you want is:
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_match_all('#data[0-9]{2}=(\[[0-9,]+\])#',$string,$matches);
print_r($matches);
Array
(
[0] => Array
(
[0] => data01=[1,2]
[1] => data02=[2,3]
)
[1] => Array
(
[0] => [1,2]
[1] => [2,3]
)
)
I have tested this as working.
preg_replace is for replacing stuff. preg_match is for extracting stuff.
So you want:
preg_match('/(.*?)\[(.*?)\](.*?)\[(.*?)\](.*)/', $str, $match);
var_dump($match);
See what you get, and work from there.

PHP - What regex code do I need to match this boundary sequence?

I have the following text string:
-asc100-17-asc100-17A-asc100-17BPH-asc100-17ASL
What regex code do I need to extract the values so that they appear in the matches array like this:
-asc100-17
-asc100-17A
-asc100-17BPH
-asc100-17ASL
Thanks in advance!
You may try this:
$str = "-asc100-17-asc100-17A-asc100-17BPH-asc100-17ASL";
preg_match_all('/-asc\d+-[0-9a-zA-Z]+/', $str, $matches);
// Print Result
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => -asc100-17
[1] => -asc100-17A
[2] => -asc100-17BPH
[3] => -asc100-17ASL
)
)
Based on the very limited information in your question, this works:
-asc100-17[A-Z]*
Debuggex Demo
If you want to capture the post -asc100- code, then use
-asc100-(17[A-Z]*)
Which places 17[the letters] into capture group one.
Might use preg_split with a lookahead as well for your scenario:
print_r(preg_split('/(?=-asc)/', $str, -1, PREG_SPLIT_NO_EMPTY));
Are you trying to break the string in an array? Then why regex is required? This function can handle what you want:
$arr = explode('-asc', '-asc100-17-asc100-17A-asc100-17BPH-asc100-17ASL');
foreach ($arr as $value) {
if(!empty($value)){
$final[] = '-asc'.$value;
}
}
print_r($final);
Output array : Array ( [0] => -asc100-17 [1] => -asc100-17A [2] => -asc100-17BPH [3] => -asc100-17ASL )

Regexp PHP Split number and string

I would like to split a string contains some numbers and letters. Like this:
ABCd Abhe123
123ABCd Abhe
ABCd Abhe 123
123 ABCd Abhe
I tried this:
<?php preg_split('#(?<=\d)(?=[a-z])#i', "ABCd Abhe 123"); ?>
But it doesn't work. Only one cell in array with "ABCd Abhe 123"
I would like for example, in cell 0: numbers and in cell1: string:
[0] => "123",
[1] => "ABCd Abhe"
Thank you for your help! ;)
Use preg_match_all instead
preg_match_all("/(\d+)*\s?([A-Za-z]+)*/", "ABCd Abhe 123" $match);
For every match:
$match[i][0] contains the matched segment
$match[i][1] contains numbers
$match[i][2] contains letters
(See here for regex test)
Then put them in an array
for($i = 0; $i < count($match); $i++)
{
if($match[i][1] != "")
$numbers[] = $match[1];
if($match[i][2] != "")
$letters[] = $match[2];
}
EDIT1
I've updated the regex. It now looks for either numbers or letters, with or without a whitespace.
EDIT2
The regex is correct, but the arrayhandling wasn't. Use preg_match_all, then $match is an array containing arrays, like:
Array
(
[0] => Array
(
[0] => Abc
[1] => aaa
[2] => 25
)
[1] => Array
(
[0] =>
[1] =>
[2] => 25
)
[2] => Array
(
[0] => Abc
[1] => aaa
[2] =>
)
)
Maybe something like this?
$numbers = preg_replace('/[^\d]/', '', $input);
$letters = preg_replace('/\d/', '', $input);

Categories