Get data out of string - php

I am going to parse a log file and I wonder how I can convert such a string:
[5189192e][game]: kill killer='0:Tee' victim='1:nameless tee' weapon=5 special=0
into some kind of array:
$log['5189192e']['game']['killer'] = '0:Tee';
$log['5189192e']['game']['victim'] = '1:nameless tee';
$log['5189192e']['game']['weapon'] = '5';
$log['5189192e']['game']['special'] = '0';

The best way is to use the preg_match_all() function with regular expressions.
For example, to match 5189192e you can use the expression
/[0-9]{7}e/
This matches seven digits followed by the letter e. To accept any lowercase letters at the end instead, change it to
/[0-9]{7}[a-z]+/
which is almost the same but accepts one or more letters at the end.
Here is a more advanced example with subpatterns that captures all the details:
<?php
$matches = array();
preg_match_all('/\[[0-9]{7}e\]\[game\]: kill killer=\'([0-9]+):([a-zA-Z]+)\' victim=\'([0-9]+):([a-zA-Z ]+)\' weapon=([0-9]+) special=([0-9]+)/', $str, $matches);
print_r($matches);
?>
$str is the string to be parsed.
$matches contains all the data you need parsed, such as killer id, name, weapon, etc.

Using preg_match_all() and a regex, you can generate a flat array of matches, which you then just have to organize into your multi-dimensional array.
Here's the code:
$log_string = "[5189192e][game]: kill killer='0:Tee' victim='1:nameless tee' weapon=5 special=0";
preg_match_all("/^\[([0-9a-z]*)\]\[([a-z]*)\]: kill (.*)='(.*)' (.*)='(.*)' (.*)=([0-9]*) (.*)=([0-9]*)$/", $log_string, $result);
$log[$result[1][0]][$result[2][0]][$result[3][0]] = $result[4][0];
$log[$result[1][0]][$result[2][0]][$result[5][0]] = $result[6][0];
$log[$result[1][0]][$result[2][0]][$result[7][0]] = $result[8][0];
$log[$result[1][0]][$result[2][0]][$result[9][0]] = $result[10][0];
// $log is your formatted array
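Both answers above hard-code the four fields. As a rough sketch of a more general variant, the key=value pairs can be collected in a loop, so the number and order of fields no longer matter. The function name parse_log_line is hypothetical, and the sketch assumes PHP 7+, a hexadecimal id, a single action word after the colon, and values that are either quoted with apostrophes or contain no whitespace:

```php
<?php
// Hypothetical helper (not from the original answers): parses one log line
// into $log[id][category][key] = value.
function parse_log_line(string $line): array
{
    $log = [];
    // Header: [id][category]: action rest-of-line
    if (!preg_match('/^\[([0-9a-f]+)\]\[(\w+)\]: \w+ (.*)$/', $line, $head)) {
        return $log; // line does not have the assumed shape
    }
    // Each pair is key='quoted value' or key=bareword
    preg_match_all("/(\w+)=(?:'([^']*)'|(\S+))/", $head[3], $pairs, PREG_SET_ORDER);
    foreach ($pairs as $pair) {
        // Group 2 holds a quoted value, group 3 an unquoted one
        $value = $pair[2] !== '' ? $pair[2] : ($pair[3] ?? '');
        $log[$head[1]][$head[2]][$pair[1]] = $value;
    }
    return $log;
}

$log = parse_log_line("[5189192e][game]: kill killer='0:Tee' victim='1:nameless tee' weapon=5 special=0");
print_r($log);
```

All values come back as strings, which matches the array layout asked for in the question.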

You definitely need a regex. Here is the pertaining PHP function and here is a regex syntax reference.

Related

How to get text after string using reg expression in PHP?

I have the following string:
<p><b>Born:</b>333<br></p>
I am trying to get the text 333 with:
<b>Born:<\/b>(.)*<br>
But it does not work
The . matches any character, and * means the preceding item is repeated zero or more times. Parentheses define a group to capture.
You've used (.)*, which repeats a one-character group, so the group keeps only the last character it matched (the regex from your post should output 3). If you want to capture the whole value 333, put the repetition inside the group using (.*?).
Use this regular expression instead,
/<b>Born:<\/b>(.*?)<br>/
Here's an example,
$reg = "/<b>Born:<\/b>(.*?)<br>/";
$str = "<p><b>Born:</b>333<br></p>";
$matches = array();
preg_match($reg, $str, $matches);
echo $matches[1]; // 333
You could try something like this:
<?php
$string = "<p><b>Born:</b>333<br></p>";
$extract = preg_replace("#(<p>.*?<\/b>)(.*?)(<br.+>)#", "$2", $string);
var_dump($extract); //<== DISPLAYS::: string '333' (length=3)
You should avoid parsing HTML with regex, since it's bad practice: HTML has too many traps, you don't take advantage of the HTML structure, and when the HTML isn't well formed the string approach stops working. The way to go is to use a tool designed to parse HTML. The combination of DOMDocument and DOMXPath can build a DOM tree and query it using the XPath language:
$str = "<p><b>Born:</b> 333<br></p>";
libxml_use_internal_errors(true);
// loadHTML() is an instance method; calling it statically fails on PHP 8
$dom = new DOMDocument();
$dom->loadHTML($str);
$xp = new DOMXPath($dom);
$result = $xp->evaluate('string(//b[.="Born:"]/following-sibling::text()[1])');
libxml_clear_errors();
echo trim($result);

Matching a substring (an apostrophe) in a given word using regex

I have a server application which looks up where the stress is in Russian words. The end user writes a word жажда. The server downloads a page from another server which contains the stresses indicated with apostrophes for each case/declension like this жа'жда. I need to find that word in the downloaded page.
In Russian the stress is always written after a vowel. I've been using so far a regex that is a grouping of all possible combinations (жа'жда|жажда'). Is there a more elegant solution using just a regex pattern instead of making a PHP script which creates all these combinations?
EDIT:
I have a word жажда
The downloaded page contains the string жа'жда. (Notice the apostrophe; I do not know beforehand where the apostrophe in the word is.)
I want to match the word with apostrophe (жа'жда).
P.S.: So far I have a PHP script creating the string (жа'жда|жажда') used in regex (apostrophe is only after vowels) which matches it. My goal is to get rid of this script and use just regex in case it's possible.
If I understand your question,
have these options (d'isorder|di'sorder|dis'order|diso'rder|disor'der|disord'er|disorde'r|disorder') and one of these is in the downloaded page and I need to find out which one it is
this may suit your needs:
<pre>
<?php
$s = "d'isorder|di'sorder|dis'order|diso'rder|disor'der|disord'er|disorde'r|disorder'|disorde'";
$s = explode("|",$s);
print_r($s);
$matches = preg_grep("#[aeiou]'#", $s);
print_r($matches);
running example: https://eval.in/207282
Uhm... Is this ok with you?
<?php
function find_stresses($word, $haystack) {
$pattern = preg_replace('/[aeiou]/', '\0\'?', $word);
$pattern = "/\b$pattern\b/";
// word = 'disorder', pattern = "/\bdi'?so'?rde'?r\b/"
preg_match_all($pattern, $haystack, $matches);
return $matches[0];
}
$hay = "something diso'rder somethingelse";
find_stresses('disorder', $hay);
// => array(diso'rder)
You didn't specify whether there can be more than one match. If not, you could use preg_match instead of preg_match_all (faster). Multiple matches are possible in some languages: in Italian, for example, we have àncora and ancòra :P
Obviously, if you use preg_match, the result would be a string instead of an array.
Based on your code, and the requirements that no function is called and that disorder is excluded, I think this is what you want. I have added a test vector.
<pre>
<?php
// test code
$downloadedPage = "
there is some disorde'r
there is some disord'er in the example
there is some di'sorder in the example
there also' is some order in the example
there is some disorder in the example
there is some dso'rder in the example
";
$word = 'disorder';
preg_match_all("#".preg_replace("#[aeiou]#", "$0'?", $word)."#iu"
, $downloadedPage
, $result
);
print_r($result);
$result = preg_grep("#'#"
, $result[0]
);
print_r($result);
// the code you need
$word = 'also';
preg_match("#".preg_replace("#[aeiou]#", "$0'?", $word)."#iu"
, $downloadedPage
, $result
);
print_r($result);
$result = preg_grep("#'#"
, $result
);
print_r($result);
Working demo: https://eval.in/207312
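The answers above use English vowels. For the Russian words in the question, the same idea might look like the following sketch. The function name find_stressed_forms is hypothetical, and the code assumes PHP 7+, a UTF-8 encoded source file and page, and the ten Russian vowel letters as the only positions where an apostrophe can follow. Lookarounds are used instead of \b so that an apostrophe after a final vowel stays part of the match:

```php
<?php
// Hypothetical helper: finds forms of $word in $haystack that carry a
// stress apostrophe after one of its vowels.
function find_stressed_forms(string $word, string $haystack): array
{
    // Allow an optional apostrophe after every Russian vowel
    $body = preg_replace('/[аеёиоуыэюя]/iu', '$0\'?', preg_quote($word, '/'));
    // The match must not be glued to another letter or apostrophe on either side
    $pattern = '/(?<![\p{L}\'])' . $body . '(?![\p{L}\'])/iu';
    preg_match_all($pattern, $haystack, $matches);
    // Keep only the forms that actually contain a stress mark
    return array_values(preg_grep("/'/", $matches[0]));
}

$page = "слово жа'жда, без ударения жажда и жажда'.";
print_r(find_stressed_forms('жажда', $page));
```

With this input, the unstressed жажда is matched by the pattern but dropped by the final preg_grep step.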

Trying to grab a string with one substring static, and one dynamic

I'm trying to find the best way to grab the dynamic substring, but replace all of the content after.
This is what I'm trying to achieve:
{table_telecommunications}
The substring {table_ is always the same; the only part that varies is telecommunications}.
I want to grab the word telecommunications so I can do a search on a MySQL table and then replace {table_telecommunications} with the content returned.
I thought of using strpos and then explode and so on.
But I guess it would be easier with regex, and I have no skill at writing regex.
Could you possibly give me the best way to do this?
Edit: I'm saying possibly regex is the best way because I need to find strings that are in this format, but the second part is variable, just like {table_*}
Use Regex.
<?php
$string = "{table_telecommunications} blabla blabla {table_block}";
preg_match_all("/\{table_(.+?)\}/is", $string, $matches);
$substrings = $matches[1];
print_r($substrings);
?>
if (preg_match('#table_([^}]+)}#', '{table_telecommunications}', $matches)){
echo $matches[1];
}
That's a regex solution. You can do the same with explode:
$parts = explode('table_', '{table_telecommunications}');
echo substr($parts[1], 0, -1);
$input = '{table_telecommunications}';
$parts = explode('_', $input); // the delimiter comes first
array_shift($parts); // drop the leading "{table"
$table_name = trim(implode('_', $parts), '}');
Should be fast, no regex required.
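The answers above extract the name, but the question also asks to replace each placeholder with the content that was looked up. A minimal sketch of that step with preg_replace_callback follows. The functions fetch_table_content and render_placeholders are hypothetical, and the array inside fetch_table_content merely stands in for the MySQL query, which is not shown in the question:

```php
<?php
// Hypothetical stand-in for the database lookup described in the question.
function fetch_table_content(string $name): string
{
    $fake_rows = [
        'telecommunications' => 'Telecom content',
        'block'              => 'Block content',
    ];
    return $fake_rows[$name] ?? ''; // unknown names become an empty string
}

// Replaces every {table_NAME} placeholder with the content found for NAME.
function render_placeholders(string $text): string
{
    return preg_replace_callback(
        '/\{table_([^}]+)\}/',
        function (array $m): string {
            return fetch_table_content($m[1]); // $m[1] is the dynamic part
        },
        $text
    );
}

echo render_placeholders('{table_telecommunications} blabla {table_block}');
// prints "Telecom content blabla Block content"
```

If the name is ever interpolated into SQL, it should be checked against a whitelist of known table names first, since it comes from the text being processed.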

PHP : Get a number between 2 strings

I have this string:
a:3:{i:0;i:2;i:1;i:3;i:2;i:4;}
I want to get number between "a:" and ":{" that is "3".
I tried to use substr and strpos but had no success.
I'm a newbie with regex and wrote this:
preg_match('/a:(.+?):{/', $v);
But it returns 1.
Thanks for any tips.
preg_match returns the number of matches, in your case 1 match.
To get the matches themselves, use the third parameter:
$matches = array();
preg_match('/a:(\d+?):{/', $v, $matches);
echo $matches[1]; // 3
That said, I think the string looks like a serialized array, which you could deserialize with unserialize and then use count on the actual array (i.e. $a = count(unserialize($v));). Be careful with user-provided serialized strings though …
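As a small sketch of the unserialize route: the allowed_classes option (available since PHP 7) is one way to be careful with untrusted input, because it stops unserialize from instantiating objects. The variable names here are illustrative only:

```php
<?php
$v = 'a:3:{i:0;i:2;i:1;i:3;i:2;i:4;}';

// Refuse to instantiate any objects found in the serialized data (PHP 7+)
$data = unserialize($v, ['allowed_classes' => false]);

// unserialize() returns false on malformed input, so check before counting
$count = is_array($data) ? count($data) : 0;

echo $count; // 3
```

This yields the element count of the array, which is the same number that appears between "a:" and ":{" in the serialized form.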
If you know that a: is always at the beginning of the string, the easiest way is:
$array = explode( ':', $string, 3 );
$number = $array[1];
You can use sscanf to obtain the number from the string:
# Input:
$str = 'a:3:{i:0;i:2;i:1;i:3;i:2;i:4;}';
# Code:
sscanf($str, 'a:%d:', $number);
# Output:
echo $number; # 3
This is often simpler than using preg_match when you'd like to obtain a specific value from a string that follows a pattern.
preg_match() returns the number of times it finds a match (0 or 1), which is why you get 1. You need to add a third parameter, $matches, in which it will store the matches.
You were not too far away with strpos() and substr()
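$pos_start = strpos($str, 'a:') + 2; // first character after "a:"
$length = strpos($str, ':{', $pos_start) - $pos_start; // substr() takes a length, not an end position
$result = substr($str, $pos_start, $length);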
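preg_match's return value only tells you whether the pattern matched; it doesn't return the matched string. The matched text goes into the array passed as the third argument.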

Parse multiple predictably formatted substrings of user data existing in a single string

I have a really long string in a certain pattern such as:
userAccountName: abc userCompany: xyz userEmail: a#xyz.com userAddress1: userAddress2: userAddress3: userTown: ...
and so on. This pattern repeats.
I need to find a way to process this string so that I have the values of userAccountName:, userCompany:, etc. (i.e. preferably in an associative array or some such convenient format).
Is there an easy way to do this or will I have to write my own logic to split this string up into different parts?
Simple regular expressions like this userAccountName:\s*(\w+)\s+ can be used to capture matches and then use the captured matches to create a data structure.
If you can arrange for the data to be formatted as it is in a URL (ie, var=data&var2=data2) then you could use parse_str, which does almost exactly what you want, I think. Some mangling of your input data would do this in a straightforward manner.
You might have to use regex or your own logic.
Are you guaranteed that the string ": " does not appear anywhere within the values themselves? If so, you could possibly use explode to split the string into an array of alternating keys and values. Note that this assumes every value is itself followed by ": "; in the sample as posted, a value and the next key are separated only by a space, so the input would need adjusting first. You'd then have to walk through this array and format it the way you want. Here's a rough (probably inefficient) example I threw together quickly:
<?php
$keysAndValuesArray = explode(': ', $dataString);
$firstKeyName = 'userAccountName';
$associativeDataArray = array();
$currentIndex = -1;
$numItems = count($keysAndValuesArray);
for ($i = 0; $i < $numItems; $i += 2) {
if($keysAndValuesArray[$i] == $firstKeyName) {
$associativeDataArray[] = array();
++$currentIndex;
}
$associativeDataArray[$currentIndex][$keysAndValuesArray[$i]] = $keysAndValuesArray[$i+1];
}
var_dump($associativeDataArray);
If you can write a regexp (for my example I'm assuming there are no colons in the values), you can parse it with preg_split or preg_match_all like this:
<?php
$raw_data = "userAccountName: abc userCompany: xyz";
$raw_data .= " userEmail: a#xyz.com userAddress1: userAddress2: ";
$data = array();
// /([^:]*\s+)?/ part works because the regexp is "greedy"
if (preg_match_all('/([a-z0-9_]+):\s+([^:]*\s+)?/i', $raw_data,
$items, PREG_SET_ORDER)) {
foreach ($items as $item) {
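// group 2 is absent when the value is empty; otherwise trim its trailing whitespace
$data[$item[1]] = isset($item[2]) ? trim($item[2]) : '';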
}
print_r($data);
}
?>
If that's not the case, please describe the grammar of your string in a bit more detail.
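The question says the pattern repeats, so a single flat array would overwrite earlier users. One possible sketch that groups the fields into one array per user is shown below. The function name parse_user_records is hypothetical, and the sketch assumes PHP 7+, that every key is a single token ending in a colon, that no token inside a value ends in a colon, and that each record starts with userAccountName:

```php
<?php
// Hypothetical helper: splits the string on whitespace and treats every
// token ending in ":" as a key; other tokens extend the current value.
function parse_user_records(string $raw, string $first_key = 'userAccountName'): array
{
    $records = [];
    $current = -1;   // index of the record being filled
    $key = null;     // key currently collecting value tokens

    foreach (preg_split('/\s+/', $raw, -1, PREG_SPLIT_NO_EMPTY) as $token) {
        if (substr($token, -1) === ':') {
            $key = substr($token, 0, -1);
            if ($key === $first_key) {
                $records[++$current] = []; // a new record begins
            }
            if ($current >= 0) {
                $records[$current][$key] = ''; // empty until a value token arrives
            }
        } elseif ($key !== null && $current >= 0) {
            // Re-join multi-word values with single spaces
            $sep = $records[$current][$key] === '' ? '' : ' ';
            $records[$current][$key] .= $sep . $token;
        }
    }
    return $records;
}

$raw = 'userAccountName: abc userCompany: xyz userEmail: a#xyz.com userAddress1: userAddress2: '
     . 'userAccountName: def userCompany: uvw';
print_r(parse_user_records($raw));
```

Fields with no value, such as userAddress1 in the sample, end up as empty strings rather than being dropped.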
PCRE is included in PHP and can meet your needs with a regexp using named groups, like:
if ($c = preg_match_all("/userAccountName: (?<userAccountName>\w+) userCompany: (?<userCompany>\w+) userEmail: /", $txt, $matches))
{
// with preg_match_all, each named group holds an array with one entry per match
$userAccountName = $matches['userAccountName'];
$userCompany = $matches['userCompany'];
// and so on...
}
The most difficult part is getting the right regexp for your needs.
You can have a look at http://txt2re.com for some help.
I think the solution closest to what I was looking for, I found at http://www.justin-cook.com/wp/2006/03/31/php-parse-a-string-between-two-strings/. I hope this proves useful to someone else. Thanks everyone for all the suggested solutions.
If I were you, I'd try to convert the string into JSON format with some regexp.
Then simply use JSON.
