preg_match_all parsing, only one match

I have the following input data (note that some of the whitespace is getting messed up):
aggr0_howzeg253_sata online raid_dp, aggr root, diskroot,
nosnap=off, raidtype=raid_dp,
32-bit raidsize=14, ignore_inconsistent=off,
snapmirrored=off, resyncsnaptime=60,
fs_size_fixed=off, snapshot_autodelete=on,
lost_write_protect=on, ha_policy=cfo,
hybrid_enabled=off, percent_snapshot_space=5%,
free_space_realloc=off
Volumes: root_vol_howzeg253, howzeg253_ixb_esx_vol16_sv_mirror,
howzeg253_ixb_esx_vol5_sv_mirror,
howzeg253_ixb_esx_vol18_sv_mirror,
howzeg253_ixb_esx_vol21_sv_mirror,
howzeg253_ixb_esx_vol33_sv_mirror,
howzeg253_ixb_esx_vol24_sv_mirror,
howzeg253_ixb_esx_vol34_sv_mirror
Plex /aggr0_howzeg253_sata/plex0: online, normal, active
RAID group /aggr0_howzeg253_sata/plex0/rg0: normal, block checksums
RAID group /aggr0_howzeg253_sata/plex0/rg1: normal, block checksums
aggr1_howzeg253_sata online raid_dp, aggr nosnap=off,
raidtype=raid_dp, raidsize=14,
32-bit ignore_inconsistent=off, snapmirrored=off,
resyncsnaptime=60, fs_size_fixed=off,
snapshot_autodelete=on, lost_write_protect=on,
ha_policy=cfo, hybrid_enabled=off,
percent_snapshot_space=5%,
free_space_realloc=off
Volumes: howzeg253_ixb_esx_vol6_sv_mirror,
howzeg253_ixb_esx_vol17_sv_mirror,
howzeg253_ixb_esx_vol7_sv_mirror,
howzeg253_ixb_esx_vol19_sv_mirror,
howzeg253_ixb_esx_vol23_sv_mirror,
howzeg253_ixb_esx_vol8_sv_mirror,
howzeg253_ixb_esx_vol36_sv_mirror
Plex /aggr1_howzeg253_sata/plex0: online, normal, active
RAID group /aggr1_howzeg253_sata/plex0/rg0: normal, block checksums
RAID group /aggr1_howzeg253_sata/plex0/rg1: normal, block checksums
I use this expression with preg_match_all:
preg_match_all("|(aggr[a-z0-9_]+)\s+.*Volumes.\s+(.*)\s+Plex.*checksums|s", $rawdata, $out);
However, the output I get only contains the information from the first block (which seems to be parsed correctly; each block starts with aggr_... at the beginning of a line).
I tried different approaches but couldn't get what I wanted (such as multiline mode with a caret at the beginning of the expression, and the s modifier).
So this is the output I get:
...
[1] => Array
(
[0] => aggr4_delng153_sas_sata
)
[2] => Array
(
[0] => delng153_ixb_esx_vol19, delng153_ixb_esx_vol20,
delng153_ixb_esx_vol21, delng153_ixb_esx_vol28,
delng153_ixb_esx_vol29, delng153_ixb_esx_vol30,
delng153_ixb_esx_vol31
)
I want the second block to be returned as well.
Can anyone help me out here?
Thanks in advance!

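Try this:
preg_match_all("|(aggr[a-z0-9_]+)\s+.*?Volumes.\s+(.*?)\s+Plex.*?checksums|s", $rawdata, $out);
preg_match_all already searches the whole string, and PHP has no g modifier (adding one triggers an "Unknown modifier" warning). The problem is that, with the s modifier, the greedy .* runs across block boundaries, so the quantifiers need to be lazy (.*?).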

Bro I think you are looking for this
'/(aggr[a-z0-9_]+)\s.*?Volumes:(.*?)rg1: normal, block checksums/s'

Really stupid that I couldn't find it yesterday evening.
This morning I looked for 5 minutes and thought: no, I'll post it.
5 minutes later I found that it worked with the U modifier.
However, I'll look into your proposed solutions as well.
It seems like Naveed's answer would return too much (the block up to the words "block checksums"), but I might be wrong.
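For anyone comparing the variants, here is a minimal sketch of the difference between the greedy and the lazy pattern. The sample input is made up and heavily shortened (the names aggr0_demo, vol_a and so on are placeholders, not the real data), so treat it as an illustration only.

```php
<?php
// Shortened, made-up sample in the same shape as the question's input.
$rawdata = <<<TXT
aggr0_demo online raid_dp, aggr root
Volumes: vol_a,
vol_b
Plex /aggr0_demo/plex0: online, normal, active
RAID group /aggr0_demo/plex0/rg0: normal, block checksums
aggr1_demo online raid_dp, aggr nosnap=off
Volumes: vol_c
Plex /aggr1_demo/plex0: online, normal, active
RAID group /aggr1_demo/plex0/rg0: normal, block checksums
TXT;

// Greedy: with the s modifier, .* swallows everything up to the LAST
// "Volumes" / "Plex" / "checksums", so both blocks collapse into one match.
$greedy = '|(aggr[a-z0-9_]+)\s+.*Volumes.\s+(.*)\s+Plex.*checksums|s';

// Lazy: .*? stops at the FIRST possible continuation, one match per block.
$lazy = '|(aggr[a-z0-9_]+)\s+.*?Volumes.\s+(.*?)\s+Plex.*?checksums|s';

$greedyCount = preg_match_all($greedy, $rawdata, $greedyOut);
$lazyCount = preg_match_all($lazy, $rawdata, $out);

echo "greedy matches: $greedyCount, lazy matches: $lazyCount\n";
print_r($out[1]); // aggregate names
print_r($out[2]); // volume lists
```

The U modifier mentioned above has the same effect as writing every quantifier lazily, because it inverts the default greediness for the whole pattern.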

Related

I am getting a error using php (str_replace)

I am getting an error while using PHP's str_replace function.
I am reading a string from a separate JSON file.
If I remove the str_replace part it works without the error, but I want the text marked with ** to become bold. If you know any other way to do that, feel free to suggest it.
<?php
$data = json_decode($readjson, true);
echo "<br/><br<br/>";
foreach ($data as $emp) {
echo str_replace("**","<strong>","$data"), $emp['message']."<br/>";
}
?>
And the output is
Notice: Array to string conversion in C:\Users\k-ver\Dropbox\Other\website stuff or smth\r3mind3r\changelog.php on line 16
Array - Weekley Update - Another great week at our side! We have made enournous advances in synching with the raspberry pi (the computer we are going to host from) and are closer than ever to our promised release We have also been fixing on the mute commands and are very close to making it work, aswell with unmute command.language feature is closing up on complete and about 70% of the bot has the language system working. We also made a new system that should be easier to use for bouth us devs and the translators. All thats left for the release atm is: -finishing synching -fixing the mute command and unmute command -make a functioning permissonlevel system -adding those last 30% of the bot that does not have the translationsystem in place. and the bot will have its huge release! (about time if you asked me)
(it is for a dev log)
and the part I don't understand is the notice and I also don't understand how to fix it
It would be awesome if you guys would like to help me.

The third argument should be a string value, e.g. $emp['message'], not the multi-dimensional array $data.
// see this line with $emp['message'] not $data array
str_replace("**","<strong>",$emp['message']);
EDIT: As per comment
<?php
$string = '{
"188762891137056769": {
"message": "\n**- Weekley Update -**\nAnother great week at our side! \nWe have made enournous advances in synching with the raspberry pi *(the computer we are going to host from)* and are closer than ever to our promised release\n\nWe have also been fixing on the mute commands and are very close to making it work, aswell with unmute command.language feature is closing \nup on complete and about 70% of the bot has the language system working. We also made a new system that should be easier to use for bouth \nus devs and the translators.\n\n**All thats left for the release atm is:**\n-finishing synching \n-fixing the mute command and unmute command \n-make a functioning permissonlevel system\n-adding those last 30% of the bot that does not have the translationsystem in place. \n\nand the bot will have its huge release! *(about time if you asked me)*"
}
}';
$array = json_decode($string,1);
$message = $array['188762891137056769']['message'];
$re = '/\*\*(.*?)\*\*/m';
$subst = '<strong>$1</strong>';
echo preg_replace($re, $subst, $message);
?>
DEMO: https://3v4l.org/ovhGq

You passed an array to str_replace("**","<strong>","$data"), which is wrong. To fix it, replace $data with the string you actually want to change, $emp['message'].

Your code is:
foreach ($data as $emp) {
echo str_replace("**","<strong>","$data"), $emp['message']."<br/>";
}
You see, $data is an array, and $emp is the current element within the foreach loop.
So you should pass the current element instead: str_replace("**", "<strong>", $emp)
By the way, I see $emp['message'], which means $emp is an array too. In that case pass $emp['message'] rather than $emp.
Maybe you should post the $readjson variable, so we know what type of data it is.

If you want to enclose the text between ** with <strong></strong>, you need to use a regex. Here is a little code that does what you want :
function boldify($text) {
return preg_replace('/\*\*((.|\n|\r)*)\*\*/imU', '<strong>$1</strong>', $text);
}
Basically, it uses the function preg_replace to replace according to the regex (the first parameter).
How does this regex work:
1) You have \*\* at the beginning because that's your "opening tag". (* is a special regex character, so it needs escaping.)
2) You have ((.|\n|\r)*).
The inner part, .|\n|\r, says "Catch me any character (the .) or (the |) a line feed (the \n) or a carriage return (the \r)."
Then you have the inner part enclosed with (inner part)*. This says "Match the inner part any number of times."
Finally, you have the "middle part" enclosed with (middle part). This says "Remember what you just caught inside the parentheses, we will need it for the replace."
3) You have \*\* again.
4) All this is enclosed by /regex/imU.
The / are just there to say where the regex actually is.
The imU are flags: i is ignore case, m is multiline, U is ungreedy.
i and m are pretty straightforward (and have no effect on this particular pattern, since it contains no letters and no ^ or $), but U says "catch the smallest group possible", so each pair of ** is matched separately.
As the second parameter you have '<strong>$1</strong>'. $1 is the group we remembered in 2).
The third parameter is the subject.
I hope that was clear.
You can just use it like this :
echo boldify($emp['message']);
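As a side-by-side illustration, here is a small sketch showing why plain str_replace cannot produce the closing tag, next to a lazy-quantifier variant of the same idea. The function name boldifyLazy and the sample text are made up for this example; the pattern uses the s modifier so that . also matches line breaks, which is an alternative to the (.|\n|\r) group above.

```php
<?php
// Hypothetical helper: wraps every **...** pair in <strong>...</strong>.
function boldifyLazy(string $text): string
{
    // s lets . match newlines; .*? is lazy, so each ** pair is handled on its own.
    return preg_replace('/\*\*(.*?)\*\*/s', '<strong>$1</strong>', $text);
}

// str_replace treats opening and closing markers the same way,
// so the closing tag is never produced.
$naive = str_replace('**', '<strong>', '**bold** text');

$html = boldifyLazy("**- Weekly Update -**\nAll good.\n**Left to do:** sync");

echo $naive, "\n";
echo $html, "\n";
```

If the message comes from user input, it is probably worth running it through htmlspecialchars before calling the function, so only the generated strong tags end up as markup.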

getCsvControl() always returns same delimiter - PHP

$file = new SplFileObject('D:\BackUp\addressbook.csv');
print_r($file->getCsvControl());
What I am trying to do is find the delimiter of a CSV file using PHP. The addressbook.csv file looks like
"id";"firstname";"lastname";"phone";"email"
"1";"jishan";"ishrak";"17878";"jishan.ishrak#gmail.com"
and another file is addressbook1.csv which is like
"id","firstname","lastname","phone","email"
"1","jishan","ishrak","17878","jishan.ishrak#gmail.com"
One is separated by "," and the other by ";", but the function
getCsvControl()
always returns an array like
Array ( [0] => , [1] => " )
I mean the [0] index always gives "," for both files.
Is there a way to solve this issue?

This is not a bug. SplFileObject::getCsvControl() was never intended to detect the delimiter of a CSV file. It only returns the default control characters, or the ones previously set with SplFileObject::setCsvControl(). Those control characters are then used when nothing is passed to the SplFileObject::fgetcsv() method.
Admittedly it's badly documented, but this was my first thought: the method never detects the characters, and a look at the PHP source code confirmed it.
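Since the detection has to be done by hand, here is one possible sketch: parse the first line with each candidate delimiter and keep the one that yields the most fields. The function name detectDelimiter and the candidate list are assumptions made for this example, not part of SplFileObject, and the heuristic can be fooled by unusual files.

```php
<?php
// Hypothetical helper: guesses the delimiter of a CSV line by trying
// each candidate and counting how many fields str_getcsv() returns.
function detectDelimiter(string $line, array $candidates = [',', ';', "\t", '|']): string
{
    $best = $candidates[0];
    $bestCount = 0;
    foreach ($candidates as $delimiter) {
        // Enclosure and escape are passed explicitly to keep the call unambiguous.
        $count = count(str_getcsv($line, $delimiter, '"', '\\'));
        if ($count > $bestCount) {
            $best = $delimiter;
            $bestCount = $count;
        }
    }
    return $best;
}

$semicolon = detectDelimiter('"id";"firstname";"lastname";"phone";"email"');
$comma = detectDelimiter('"id","firstname","lastname","phone","email"');

echo $semicolon, ' ', $comma, "\n";
```

The result could then be handed to SplFileObject::setCsvControl() before reading the file with fgetcsv().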

Probably this is a bug?
As you can see in the first comment on the PHP doc page (from a year ago), it seems that this function always returns the same delimiter.
UPDATE
This is not a bug; see Pazi ツ's answer.

PHP Regex search failing when adding a second capture group

I have the following named capture group that works exactly as intended. It grabs the last date/time of a specific format from a string of text.
$re = "/.*(?<date>[0-9]+\\/[0-9]+\\/[0-9]+ [0-9]+:[0-9]+:[0-9]{1,2} (AM|PM))/s";
I want to capture the user ID that follows so I changed it to the following
$re = "/.*(?<date>[0-9]+\\/[0-9]+\\/[0-9]+ [0-9]+:[0-9]+:[0-9]{1,2} (AM|PM)) (?<name>\\w+)/s";
However when I do so it breaks both values giving the following error
Notice: Undefined index: date in Q:\XAMPP\htdocs\index.php on line 272
The Error stems from the preg_match matches array being blank. Print_r confirms the array does not contain any information once the regex is changed to the second value.
Both of these work fine in external sites as the link below to Regex101 shows
http://regex101.com/r/zO0mK0/4
Using PHP 5.5.9
So the question is, am I missing something in this regex statement that is breaking it between the external site and my internal code or does this work meaning it is 100% purely my php that is causing this issue.
$LastDateRegex = "/.*(?<date>[0-9]+\\/[0-9]+\\/[0-9]+ [0-9]+:[0-9]+:[0-9]{1,2} (AM|PM))/s";
preg_match($LastDateRegex, $arr2['WorkLog'], $LastDateMatches);
$Modsecs = (strtotime($ts) - strtotime($LastDateMatches['date']))%60;
This is an example of the code being used. As mentioned above, I know the error stems from the $LastDateMatches array being empty for the second regex example, however the code works 100% with the first so there is something between the two that causes the issue.

The doubled backslashes (\\w+, and \\/ in a few places) are not needed, although inside a double-quoted PHP string \\ collapses to a single backslash, so they should not make a difference.
I'm not quite sure what you want returned, but I ran this regular expression, where $str is the text from your regex101 link:
$regex = "/.*(?<date>\d+\/\d+\/\d+ \d+:\d+:\d{1,2} (?:AM|PM)) (?<name>\w+)/s";
preg_match($regex, $str, $LastDateMatches);
Output,
Array
(
[0] => ...
[date] => 6/20/2014 10:04:32 PM
[1] => 6/20/2014 10:04:32 PM
[name] => ihugett
[2] => ihugett
)
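For a self-contained check, here is a small sketch that runs the same pattern against a made-up work log (the $log text and the user jdoe are invented for this example; the real WorkLog content is not shown in the question). It also guards against an empty match array, which is what produced the "Undefined index" notice.

```php
<?php
// Made-up work log with two entries; the leading greedy .* makes the
// pattern settle on the LAST date/time that is followed by a user ID.
$log = "6/19/2014 9:15:02 AM jdoe Opened ticket\n"
     . "6/20/2014 10:04:32 PM ihugett Closed ticket";

$re = '/.*(?<date>\d+\/\d+\/\d+ \d+:\d+:\d{1,2} (?:AM|PM)) (?<name>\w+)/s';

$date = null;
$name = null;
// preg_match() returns 1 on a match, 0 on no match, false on error.
if (preg_match($re, $log, $m) === 1) {
    $date = $m['date'];
    $name = $m['name'];
}

echo "$date by $name\n";
```

If preg_match returns 0 for the real data, the date is most likely not followed by a single space and a word character in that text, so it is worth dumping the raw string (for example with var_dump) to look for line breaks or extra whitespace.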

add ' to front of string from vector

I need to print the results from a html parsing into an PHP array. I am stuck at the very last part.
library(XML)
url ='http://www.brainyquote.com/quotes/authors/j/john_kenneth_galbraith.html'
page <- htmlParse(url)
quote <- xpathSApply(page,
"//text()[not(ancestor::script)][not(ancestor::style)][not(ancestor::noscript)][not(ancestor::form)]",
xmlValue)
quote = quote[nchar(quote) > 50] # this removes all none quotes based on string length
quote = quote[1:(length(quote)-2)] # this drops the last
out = paste(quote, collaspe= "', ") # how to get the ' at the front of the quote
write(out, "quote.txt")
The final output has each quote followed by ', (apostrophe, comma) at the end. I need to put a ' (apostrophe) at the beginning as well and have no idea how to do it. I tried converting from R to JSON, but that does not work for the simple PHP array I use, which is structured like this:
<?php
$quotes = array('quote goes here', 'quote goes here', 'final quote');
$rand = rand( 0, count($quotes)-1 );
echo $quotes[$rand];
?>
I do not really use PHP, but it runs on everything, so I built this random quote maker in simple terms. I could rewrite it in JavaScript and use a JSON array, but then I would need to write JavaScript.

You can simplify the extraction somewhat (maybe this takes care of your apostrophe etc ?)
library(XML)
url ='http://www.brainyquote.com/quotes/authors/j/john_kenneth_galbraith.html'
page <- htmlParse(url)
out <- sapply(page['//span[@class="bqQuoteLink"]/a'], xmlValue)
write(out, "quote.txt")
> head(paste0("'", out, "'"))
[1] "'The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is, the search for a superior moral justification for selfishness.'"
[2] "'Economics is extremely useful as a form of employment for economists.'"
[3] "'All of the great leaders have had one characteristic in common: it was the willingness to confront unequivocally the major anxiety of their people in their time. This, and not much else, is the essence of leadership.'"
[4] "'In economics, the majority is always wrong.'"
[5] "'Faced with the choice between changing one's mind and proving that there is no need to do so, almost everyone gets busy on the proof.'"
[6] "'Under capitalism, man exploits man. Under communism, it's just the opposite.'"

Extract Relevant Tag/Keywords from Text block

I want an implementation where the user provides a block of text like:
"Requirements
- Working knowledge, on LAMP Environment using Linux, Apache 2,
MySQL 5 and PHP 5,
- Knowledge of Web 2.0 Standards
- Comfortable with JSON
- Hands on Experience on working with Frameworks, Zend, OOPs
- Cross Browser Javascripting, JQuery etc.
- Knowledge of Version Control Software such as sub-version will be
preferable."
What I want to do is automatically select relevant keywords and create tags/keywords, hence for the above piece of text, relevant tags should be: mysql, php, json, jquery, version control, oop, web2.0, javascript
How can I go about doing it in PHP/Javascript etc? A headstart would be really helpful.

A very naive method is to remove common stopwords from the text, leaving you with more meaningful words like 'Standards', 'JSON', etc. You will still get a lot of noise however, so you may consider a service like OpenCalais which can do a rather sophisticated analysis of your text.
Update:
Okay, the link in my previous answer pointed to implementations, but you asked for one so a simple one is here:
function stopWords($text, $stopwords) {
    // Trim line breaks and spaces from the stopwords and lowercase them
    $stopwords = array_map(function($x){return trim(strtolower($x));}, $stopwords);
    // Replace all digits and non-word chars with a comma
    $pattern = '/[0-9\W]/';
    $text = preg_replace($pattern, ',', $text);
    // Create an array from $text
    $text_array = explode(",", $text);
    // Remove whitespace and lowercase the words in $text
    $text_array = array_map(function($x){return trim(strtolower($x));}, $text_array);
    // Start with an empty list so an array is returned even when nothing is kept
    $keywords = array();
    foreach ($text_array as $term) {
        if (!in_array($term, $stopwords)) {
            $keywords[] = $term;
        }
    }
    // Drop the empty strings left between consecutive separators
    return array_filter($keywords);
}
$stopwords = file('stop_words.txt');
$text = "Requirements - Working knowledge, on LAMP Environment using Linux, Apache 2, MySQL 5 and PHP 5, - Knowledge of Web 2.0 Standards - Comfortable with JSON - Hands on Experience on working with Frameworks, Zend, OOPs - Cross Browser Javascripting, JQuery etc. - Knowledge of Version Control Software such as sub-version will be preferable.";
print_r(stopWords($text, $stopwords));
You can see this, and the contents of stop_words.txt, in this Gist.
Running the above on your example text produces the following array:
Array
(
[0] => requirements
[4] => linux
[6] => apache
[10] => mysql
[13] => php
[25] => json
[28] => frameworks
[30] => zend
[34] => browser
[35] => javascripting
[37] => jquery
[38] => etc
[42] => software
[43] => preferable
)
So, like I said, this is somewhat naive and could use more optimization (plus it's slow) but it does pull out the more relevant keywords from your text. You would need to do some fine tuning on the stop words as well. Capturing terms like Web 2.0 will be very difficult, so again I think you would be better off using a serious service like OpenCalais which can understand a text and return a list of entities and references. DocumentCloud relies on this very service to gather information from documents.
Also, for client side implementation you could do pretty much the same thing with JavaScript, and probably much cleaner (although it could be slow for the client.)
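A more compact take on the same stopword idea, as a rough sketch: split the text into lowercase words and subtract the stopword list with array_diff. The function name extractKeywords and the tiny inline stopword list are made up for this example; a real list would be much longer.

```php
<?php
// Hypothetical helper: returns the unique lowercase words of $text
// that are not in $stopwords, in order of first appearance.
function extractKeywords(string $text, array $stopwords): array
{
    // Split on anything that is not a letter; skip empty pieces.
    $words = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    // Normalise the stopwords the same way (trim line breaks, lowercase).
    $stop = array_map('strtolower', array_map('trim', $stopwords));
    return array_values(array_unique(array_diff($words, $stop)));
}

// Tiny illustrative stopword list.
$stopwords = ['on', 'and', 'of', 'with', 'using'];
$keywords = extractKeywords(
    'Working knowledge on LAMP using Linux, Apache 2, MySQL 5 and PHP 5',
    $stopwords
);

print_r($keywords);
```

Like the version above, this drops digits, so terms such as "Web 2.0" are still lost.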

I did a quick review of these this morning and to my surprise one which performs best with my test phrase was written in PHP
http://code.fivefilters.org/term-extraction
demo: http://fivefilters.org/term-extraction/
What looked like the most professional one performed abysmally: viewer.opencalais.com
Others that were OK were (not sure what language they're written in)
www.nactem.ac.uk/software/termine/#form
www.alchemyapi.com/api/keyword/

This is not easy to do because it requires some type of fuzzy logic. You should use the Yahoo Term extractor YQL
Check it out: link

It depends on whether you only want to show the client the keywords/tags, or whether you want to extract the keywords/tags from the block of text and then do further computation with them.
If you only need to show them, then client-side handling is fine. If you need them for further computation, then use server-side handling.
I can recommend a JavaScript client-side implementation if you can supply some more details. If you want to generically "know" the keywords, then some kind of clever solution is necessary.
If you have a list of keywords, then you can use regular expressions to extract the data.
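The known-keyword approach could look roughly like this on the server side. The function name findKnownKeywords and the keyword list are assumptions for this sketch; each keyword is escaped with preg_quote so it is matched literally.

```php
<?php
// Hypothetical helper: returns which of the known keywords occur in
// $text (case-insensitive, whole words only), in order of appearance.
function findKnownKeywords(string $text, array $known): array
{
    // Escape each keyword so characters like . or + are taken literally.
    $alternatives = array_map(fn($k) => preg_quote($k, '/'), $known);
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';
    preg_match_all($pattern, $text, $m);
    return array_values(array_unique(array_map('strtolower', $m[0])));
}

$tags = findKnownKeywords(
    'Comfortable with JSON, MySQL 5 and PHP 5; JQuery etc.',
    ['mysql', 'php', 'json', 'jquery', 'version control']
);

print_r($tags);
```

The \b word boundaries only work for keywords that start and end with a word character, so entries like "web2.0" are fine but something like "c++" would need different handling.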

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.
