I have an object which has a state property, for example state = 'state4' or state = 'state2'.
Now I also have an array of all available states that the state property can get, state1 to state8 (note: the states are not named stateN. They have eight different names, like payment or canceled. I just put stateN to describe the problem).
In addition to that, I have a logical expression like $expression = !state1||state4&&(!state2||state5) for example. This is the code for the above description:
$state = 'state4';
$expression = '!state1||state4&&(!state2||state5)';
Now I want to check if the logical expression is true or false. In the above case, it's true. In the following case it would be false:
$state = 'state1';
$expression = state4&&!state2||(!state1||state7);
How could this be solved in an elegant way?
//Initialize
$state = 'state4';
$expression = '!state1||state4&&(!state2||state5)';
//Adapt to your needs
$pattern='/state\d/';
//Replace
$e=str_replace($state,'true',$expression);
while (preg_match_all($pattern,$e,$matches)
$e=str_replace($matches[0],'false',$e);
//Eval
eval("\$result=$e;");
echo $result;
Edit:
Your update to the OQ necessitates some minor work:
//Initialize
$state = 'payed';
$expression = '!payed||cancelled&&(!whatever||shipped)';
//Adapt to your needs
$possiblestates=array(
'payed',
'cancelled',
'shipped',
'whatever'
);
//Replace
$e=str_replace($state,'true',$expression);
$e=str_replace($possiblestates,'false',$e);
//Eval
eval("\$result=$e;");
echo $result;
Edit 2
There has been concern about eval and PHP injection in the comments: The expression and the replacements are completly controlled by the application, no user input involved. As long as this holds, eval is safe.
I am using ExpressionLanguage, but there are few different solutions
ExpressionLanguage Symfony Component - https://symfony.com/doc/current/components/expression_language.html
cons - weird array syntax - array['key']. array.key works only for objects
cons - generate notice for array['key'] when key is not defined
pros - stable and well maintainer
https://github.com/mossadal/math-parser
https://github.com/optimistex/math-expression
Please remember that eval is NOT an option, under NO circumstances. We don't live an a static world. Any software always grows and evolves. What was once considered a safe input an one point may turn completely unsafe and uncontrolled.
I think you have a case which can be solved if you model each of your expressions as a rooted directed acyclic graph (DAG).
I assumed acyclic since your ultimate aim is to find the result of boolean algebra operations (if cycling occur in any of graph, then it'd be nonsense I think).
However, even if your graph structure—meaningfully—can be cyclic then your target search term will be cyclic graph, and it should still have a solution.
$expression = '!state1||state4&&(!state2||state5)';
And you have one root with two sub_DAGs in your example.
EXPRESSION as a Rooted DAG:
EXPRESSION
|
AND
___/ \___
OR OR
/ \ / \
! S_1 S_4 ! S_2 S5
Your adjacency list is:
expression_adj_list = [
expression => [ subExp_1, subExp_2 ] ,
subExp_1 => [ ! S_1, S_4 ],
subExp_2 => [ ! S_2, S5 ]
]
Now you can walk through this graph by BFS (breadth-first search algorithm) or DFS (depth-first search algorithm) or your custom, adjusted algorithm.
Of course you can just visit the adjacency list with keys and values as many times as you need if this suits and is easier for you.
You'll need a lookup table to teach your algorithm that. For example,
S2 && S5 = 1,
S1 or S4 = 0,
S3 && S7 = -1 (to throw an exception maybe)
After all, the algorithm below can solve your expression's result.
$adj_list = convert_expression_to_adj_list();
// can also be assigned by a function.
// root is the only node which has no incoming-edge in $adj_list.
$root = 'EXPRESSION';
q[] = $root; //queue to have expression & subexpressions
$results = [];
while ( ! empty(q)) {
$current = array_shift($q);
if ( ! in_array($current, $results)) {
if (isset($adj_list[$current])) { // if has children (sub/expression)
$children = $adj_list[$current];
// true if all children are states. false if any child is subexpression.
$bool = is_calculateable($children);
if ($bool) {
$results[$current] = calc($children);
}
else {
array_unshift($q, $current);
}
foreach ($children as $child) {
if (is_subexpresssion($child) && ! in_array($child, $results)) {
array_unshift($q, $child);
}
}
}
}
}
return $results[$root];
This approach has a great advantage also: if you save the results of the expressions in your database, if an expression is a child of the root expression then you won't need to recalculate it, just use the result from the database for the child subexpressions. In this way, you always have a two-level depth DAG (root and its children).
Related
Mr. Wiktor, this question is by no means a duplicate of the question you juxtaposed in your justification of marking this a duplicate.
To wit, the question you pointed to asks what's the countpart of preg_match in python. I, even in the TITLE ITSELF mention the "re.search" which was the answer to the thread you mentioned. I'm aware of re.search
My question is SPECIFICALLY to how I can use the 3rd argument in re.search the way that its counterpart in php is used in the example I provided. Mr. Wiktor, I respectfully request that you unmark my thread as duplicate Thank you in advance sir.
What I'm trying to do is Stemming (NLP) for the Greek language in python. The php code is this:
protected static $step1list = array(
"φαγια"=>"φα",
"φαγιου"=>"φα",
"φαγιων"=>"φα",
"σκαγια"=>"σκα",
"σκαγιου"=>"σκα",
"σκαγιων"=>"σκα",
"ολογιου"=>"ολο",
"ολογια"=>"ολο",
"ολογιων"=>"ολο",
"σογιου"=>"σο",
"σογια"=>"σο",
"σογιων"=>"σο",
"τατογια"=>"τατο",
"τατογιου"=>"τατο",
"τατογιων"=>"τατο",
"κρεασ"=>"κρε",
"κρεατοσ"=>"κρε",
"κρεατα"=>"κρε",
"κρεατων"=>"κρε",
"περασ"=>"περ",
"περατοσ"=>"περ",
"περατα"=>"περ",
"περατων"=>"περ",
"τερασ"=>"τερ",
"τερατοσ"=>"τερ",
"τερατα"=>"τερ",
"τερατων"=>"τερ",
"φωσ"=>"φω",
"φωτοσ"=>"φω",
"φωτα"=>"φω",
"φωτων"=>"φω",
"καθεστωσ"=>"καθεστ",
"καθεστωτοσ"=>"καθεστ",
"καθεστωτα"=>"καθεστ",
"καθεστωτων"=>"καθεστ",
"γεγονοσ"=>"γεγον",
"γεγονοτοσ"=>"γεγον",
"γεγονοτα"=>"γεγον",
"γεγονοτων"=>"γεγον"
);
protected static $step1regexp="/(.*)(φαγια|φαγιου|φαγιων|σκαγια|σκαγιου|σκαγιων|ολογιου|ολογια|ολογιων|σογιου|σογια|σογιων|τατογια|τατογιου|τατογιων|κρεασ|κρεατοσ|κρεατα|κρεατων|περασ|περατοσ|περατα|περατων|τερασ|τερατοσ|τερατα|τερατων|φωσ|φωτοσ|φωτα|φωτων|καθεστωσ|καθεστωτοσ|καθεστωτα|καθεστωτων|γεγονοσ|γεγονοτοσ|γεγονοτα|γεγονοτων)$/u";
$w;
$stem="";
$suffix="";
$firstch="";
if (preg_match($step1regexp, $w, $fp)) {
$stem = $fp[1];
$suffix = $fp[2];
$w = $stem.$step1list[$suffix];
}
The latest thing i've tried is this (i don't rly have blah on the lists, they're the same as the php one):
import re
step1list = {
u"φαγια": u"φα",
blah blah blah blah
}
stem = ""
suffix=""
firstch=""
s = u"σογια"
reg = re.compile(r'/(.*)(φαγια|φαγιου|φαγιων|σκαγια|σκαγιου|σκαγιων|ολογιου|ολογια|ολογιων|σογιου|σογια|σογιων|τατογια|τατογιου|τατογιων|κρεασ|κρεατοσ|κρεατα|κρεατων|περασ|περατοσ|περατα|περατων|τερασ|τερατοσ|τερατα|τερατων|φωσ|φωτοσ|φωτα|φωτων|καθεστωσ|καθεστωτοσ|καθεστωτα|καθεστωτων|γεγονοσ|γεγονοτοσ|γεγονοτα|γεγονοτων)$');
m = reg.search(s)
if m:
stem = m.group(1);
suffix = m.group(2);
s = "{0}{1}".format(stem, step1list[suffix])
print(s)
print(stem)
print(suffix)
what I get as a result is:
σογια
(with 2 blank lines after it) which means that the 2 groups are not successfully identified :(
How do I mend this?
from the docs: (also see match vs search)
import re
p = re.compile( regex )
m = p.search( 'string goes here' ) #p.match() to find from start of string only
if m:
print 'Match found: ', m.group() # group(1...n) for capture groups
else:
print 'No match'
In a previous Using R, how to reference variable variables (or variables variable) a la PHP[post]
I asked a question about something in R analagous to PHP $$ function:
Using R stats, I want to access a variable variable scenario similar to PHP double-dollar-sign technique: http://php.net/manual/en/language.variables.variable.php
Specifically, I am looking for a function in R that is equivalent to $$ in PHP.
The get( response works for strings (characters).
lapply is a way to loop over lists
Or I can loop over and get the values ...
for(name in names(vars))
{
val = vars[[name]];
I still haven't had the $$ function in R answered, although the lapply solved what I needed in the moment.
`$$` <- function
that allows any variable type to be evaluated. That is still the question.
UPDATES
> mlist = list('four'="score", 'seven'="years");
> str = 'mlist$four'
> mlist
$four
[1] "score"
$seven
[1] "years"
> str
[1] "mlist$four"
> get(str)
Error in get(str) : object 'mlist$four' not found
> mlist$four
[1] "score"
Or how about attributes for an object such as mobj#index
UPDATES #2
So let's put specific context on the need. I was hacking the texreg package to build a custom latex output of 24 models of regression for a research paper. I am using plm fixed effects, and the default output of texreg uses dcolumns to center, which I don't like (I prefer r#{}l, so I wanted to write my own template. The purpose for me, to code this, is for me to write extensible code that I can use again and again. I can rebuild my 24 tables across 4 pages in seconds, so if the data change, or if I want to tweak the function, I immediately have a nice answer. The power of abstraction.
As I hacked this, I wanted to get more than the number of observations, but also the number of groups, which can be any user defined index. In my case it is "country" (wait for it, hence, the need for variable variables).
If I do a lookup of the structure, what I want is right there: model$model#index$country which would be nice to simply call as $$('model$model#index$country'); where I can easily build the string using paste. Nope, this is my workaround.
getIndexCount = function(model,key="country")
{
myA = attr(summary(model)$model,"index");
for(i in 1:length(colnames(myA)))
{
if(colnames(myA)[i] == key) {idx = i; break;}
}
if(!is.na(idx))
{
length(unique(myA[,idx]));
} else {
FALSE;
}
}
UPDATES #3
Using R, on the command line, I can type in a string and it gets evaluated. Why can't that internal function be directly accessed, and the element captured that then gets printed to the screen?
There is no equivalent function in R. get() works for all types, not just strings.
Here is what I came up with, after chatting with the R-bug group, and getting some ideas from them. KUDOS!
`$$` <- function(str)
{
E = unlist( strsplit(as.character(str),"[#]") );
k = length(E);
if(k==1)
{
eval(parse(text=str));
} else {
# k = 2
nstr = paste("attributes(",E[1],")",sep="");
nstr = paste(nstr,'$',E[2],sep="");
if(k>2) {
for(i in 3:k)
{
nstr = paste("attributes(",nstr,")",sep="");
nstr = paste(nstr,'$',E[i],sep="");
}
}
`$$`(nstr);
}
}
Below are some example use cases, where I can directly access what the str(obj) is providing... Extending the utility of the '$' operator by also allowing '#' for attributes.
model = list("four" = "score", "seven"="years");
str = 'model$four';
result = `$$`(str);
print(result);
matrix = matrix(rnorm(1000), ncol=25);
str='matrix[1:5,8:10]';
result = `$$`(str);
print(result);
## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14);
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69);
group <- gl(2, 10, 20, labels = c("Ctl","Trt"));
weight <- c(ctl, trt);
lm.D9 <- lm(weight ~ group);
lm.D90 <- lm(weight ~ group - 1); # omitting intercept
myA = anova(lm.D9); myA; str(myA);
str = 'myA#heading';
result = `$$`(str);
print(result);
myS = summary(lm.D90); myS; str(myS);
str = 'myS$terms#factors';
result = `$$`(str);
print(result);
str = 'myS$terms#factors#dimnames';
result = `$$`(str);
print(result);
str = 'myS$terms#dataClasses#names';
result = `$$`(str);
print(result);
After realizing the back-tick can be a bit tedious, I chose to update the function, calling it access
access <- function(str)
{
E = unlist( strsplit(as.character(str),"[#]") );
k = length(E);
if(k==1)
{
eval(parse(text=str));
} else {
# k = 2
nstr = paste("attributes(",E[1],")",sep="");
nstr = paste(nstr,'$',E[2],sep="");
if(k>2) {
for(i in 3:k)
{
nstr = paste("attributes(",nstr,")",sep="");
nstr = paste(nstr,'$',E[i],sep="");
}
}
access(nstr);
}
}
I think example will be much better than loooong description :)
Let's assume we have an array of arrays:
("Server1", "Server_1", "Main Server", "192.168.0.3")
("Server_1", "VIP Server", "Main Server")
("Server_2", "192.168.0.4")
("192.168.0.3", "192.168.0.5")
("Server_2", "Backup")
Each line contains strings which are synonyms. And as a result of processing of this array I want to get this:
("Server1", "Server_1", "Main Server", "192.168.0.3", "VIP Server", "192.168.0.5")
("Server_2", "192.168.0.4", "Backup")
So I think I need a kind of recursive algorithm. Programming language actually doesn't matter — I need only a little help with idea in general. I'm going to use php or python.
Thank you!
This problem can be reduced to a problem in graph theory where you find all groups of connected nodes in a graph.
An efficient way to solve this problem is doing a "flood fill" algorithm, which is essentially a recursive breath first search. This wikipedia entry describes the flood fill algorithm and how it applies to solving the problem of finding connected regions of a graph.
To see how the original question can be made into a question on graphs: make each entry (e.g. "Server1", "Server_1", etc.) a node on a graph. Connect nodes with edges if and only if they are synonyms. A matrix data structure is particularly appropriate for keeping track of the edges, provided you have enough memory. Otherwise a sparse data structure like a map will work, especially since the number of synonyms will likely be limited.
Server1 is Node #0
Server_1 is Node #1
Server_2 is Node #2
Then edge[0][1] = edge[1][0] = 1, indicated that there is an edge between nodes #0 and #1 ( which means that they are synonyms ). While edge[0][2] = edge[2][0] = 0, indicating that Server1 and Server_2 are not synonyms.
Complexity Analysis
Creating this data structure is pretty efficient because a single linear pass with a lookup of the mapping of strings to node numbers is enough to crate it. If you store the mapping of strings to node numbers in a dictionary then this would be a O(n log n) step.
Doing the flood fill is O(n), you only visit each node in the graph once. So, the algorithm in all is O(n log n).
Introduce integer marking, which indicates synonym groups. On start one marks all words with different marks from 1 to N.
Then search trough your collection and if you find two words with indexes i and j are synonym, then remark all of words with marking i and j with lesser number of both. After N iteration you get all groups of synonyms.
It is some dirty and not throughly efficient solution, I believe one can get more performance with union-find structures.
Edit: This probably is NOT the most efficient way of solving your problem. If you are interested in max performance (e.g., if you have millions of values), you might be interested in writing more complex algorithm.
PHP, seems to be working (at least with data from given example):
$data = array(
array("Server1", "Server_1", "Main Server", "192.168.0.3"),
array("Server_1", "VIP Server", "Main Server"),
array("Server_2", "192.168.0.4"),
array("192.168.0.3", "192.168.0.5"),
array("Server_2", "Backup"),
);
do {
$foundSynonyms = false;
foreach ( $data as $firstKey => $firstValue ) {
foreach ( $data as $secondKey => $secondValue ) {
if ( $firstKey === $secondKey ) {
continue;
}
if ( array_intersect($firstValue, $secondValue) ) {
$data[$firstKey] = array_unique(array_merge($firstValue, $secondValue));
unset($data[$secondKey]);
$foundSynonyms = true;
break 2; // outer foreach
}
}
}
} while ( $foundSynonyms );
print_r($data);
Output:
Array
(
[0] => Array
(
[0] => Server1
[1] => Server_1
[2] => Main Server
[3] => 192.168.0.3
[4] => VIP Server
[6] => 192.168.0.5
)
[2] => Array
(
[0] => Server_2
[1] => 192.168.0.4
[3] => Backup
)
)
This would yield lower complexity then the PHP example (Python 3):
a = [set(("Server1", "Server_1", "Main Server", "192.168.0.3")),
set(("Server_1", "VIP Server", "Main Server")),
set(("Server_2", "192.168.0.4")),
set(("192.168.0.3", "192.168.0.5")),
set(("Server_2", "Backup"))]
b = {}
c = set()
for s in a:
full_s = s.copy()
for d in s:
if b.get(d):
full_s.update(b[d])
for d in full_s:
b[d] = full_s
c.add(frozenset(full_s))
for k,v in b.items():
fsv = frozenset(v)
if fsv in c:
print(list(fsv))
c.remove(fsv)
I was looking for a solution in python, so I came up with this solution. If you are willing to use python data structures like sets
you can use this solution too. "It's so simple a cave man can use it."
Simply this is the logic behind it.
foreach set_of_values in value_collection:
alreadyInSynonymSet = false
foreach synonym_set in synonym_collection:
if set_of_values in synonym_set:
alreadyInSynonymSet = true
synonym_set = synonym_set.union(set_of_values)
if not alreadyInSynonymSet:
synonym_collection.append(set(set_of_values))
vals = (
("Server1", "Server_1", "Main Server", "192.168.0.3"),
("Server_1", "VIP Server", "Main Server"),
("Server_2", "192.168.0.4"),
("192.168.0.3", "192.168.0.5"),
("Server_2", "Backup"),
)
value_sets = (set(value_tup) for value_tup in vals)
synonym_collection = []
for value_set in value_sets:
isConnected = False # If connected to a term in the graph
print(f'\nCurrent Value Set: {value_set}')
for synonyms in synonym_collection:
# IF two sets are disjoint, they don't have common elements
if not set(synonyms).isdisjoint(value_set):
isConnected = True
synonyms |= value_set # Appending elements of new value_set to synonymous set
break
# If it's not related to any other term, create a new set
if not isConnected:
print ('Value set not in graph, adding to graph...')
synonym_collection.append(value_set)
print('\nDone, Completed Graphing Synonyms')
print(synonym_collection)
This will have a result of
Current Value Set: {'Server1', 'Main Server', '192.168.0.3', 'Server_1'}
Value set not in graph, adding to graph...
Current Value Set: {'VIP Server', 'Main Server', 'Server_1'}
Current Value Set: {'192.168.0.4', 'Server_2'}
Value set not in graph, adding to graph...
Current Value Set: {'192.168.0.3', '192.168.0.5'}
Current Value Set: {'Server_2', 'Backup'}
Done, Completed Graphing Synonyms
[{'VIP Server', 'Main Server', '192.168.0.3', '192.168.0.5', 'Server1', 'Server_1'}, {'192.168.0.4', 'Server_2', 'Backup'}]
Is there any available solution for (re-)generating PHP code from the Parser Tokens returned by token_get_all? Other solutions for generating PHP code are welcome as well, preferably with the associated lexer/parser (if any).
From my comment:
Does anyone see a potential problem,
if I simply write a large switch
statement to convert tokens back to
their string representations (i.e.
T_DO to 'do'), map that over the
tokens, join with spaces, and look for
some sort of PHP code pretty-printing
solution?
After some looking, I found a PHP homemade solution in this question, that actually uses the PHP Tokenizer interface, as well as some PHP code formatting tools which are more configurable (but would require the solution as described above).
These could be used to quickly realize a solution. I'll post back here when I find some time to cook this up.
Solution with PHP_Beautifier
This is the quick solution I cooked up, I'll leave it here as part of the question. Note that it requires you to break open the PHP_Beautifier class, by changing everything (probably not everything, but this is easier) that is private to protected, to allow you to actually use the internal workings of PHP_Beautifier (otherwise it was impossible to reuse the functionality of PHP_Beautifier without reimplementing half their code).
An example usage of the class would be:
file: main.php
<?php
// read some PHP code (the file itself will do)
$phpCode = file_get_contents(__FILE__);
// create a new instance of PHP2PHP
$php2php = new PHP2PHP();
// tokenize the code (forwards to token_get_all)
$phpCode = $php2php->php2token($phpCode);
// print the tokens, in some way
echo join(' ', array_map(function($token) {
return (is_array($token))
? ($token[0] === T_WHITESPACE)
? ($token[1] === "\n")
? "\n"
: ''
: token_name($token[0])
: $token;
}, $phpCode));
// transform the tokens back into legible PHP code
$phpCode = $php2php->token2php($phpCode);
?>
As PHP2PHP extends PHP_Beautifier, it allows for the same fine-tuning under the same API that PHP_Beautifier uses. The class itself is:
file: PHP2PHP.php
class PHP2PHP extends PHP_Beautifier {
function php2token($phpCode) {
return token_get_all($phpCode);
}
function token2php(array $phpToken) {
// prepare properties
$this->resetProperties();
$this->aTokens = $phpToken;
$iTotal = count($this->aTokens);
$iPrevAssoc = false;
// send a signal to the filter, announcing the init of the processing of a file
foreach($this->aFilters as $oFilter)
$oFilter->preProcess();
for ($this->iCount = 0;
$this->iCount < $iTotal;
$this->iCount++) {
$aCurrentToken = $this->aTokens[$this->iCount];
if (is_string($aCurrentToken))
$aCurrentToken = array(
0 => $aCurrentToken,
1 => $aCurrentToken
);
// ArrayNested->off();
$sTextLog = PHP_Beautifier_Common::wsToString($aCurrentToken[1]);
// ArrayNested->on();
$sTokenName = (is_numeric($aCurrentToken[0])) ? token_name($aCurrentToken[0]) : '';
$this->oLog->log("Token:" . $sTokenName . "[" . $sTextLog . "]", PEAR_LOG_DEBUG);
$this->controlToken($aCurrentToken);
$iFirstOut = count($this->aOut); //5
$bError = false;
$this->aCurrentToken = $aCurrentToken;
if ($this->bBeautify) {
foreach($this->aFilters as $oFilter) {
$bError = true;
if ($oFilter->handleToken($this->aCurrentToken) !== FALSE) {
$this->oLog->log('Filter:' . $oFilter->getName() , PEAR_LOG_DEBUG);
$bError = false;
break;
}
}
} else {
$this->add($aCurrentToken[1]);
}
$this->controlTokenPost($aCurrentToken);
$iLastOut = count($this->aOut);
// set the assoc
if (($iLastOut-$iFirstOut) > 0) {
$this->aAssocs[$this->iCount] = array(
'offset' => $iFirstOut
);
if ($iPrevAssoc !== FALSE)
$this->aAssocs[$iPrevAssoc]['length'] = $iFirstOut-$this->aAssocs[$iPrevAssoc]['offset'];
$iPrevAssoc = $this->iCount;
}
if ($bError)
throw new Exception("Can'process token: " . var_dump($aCurrentToken));
} // ~for
// generate the last assoc
if (count($this->aOut) == 0)
throw new Exception("Nothing on output!");
$this->aAssocs[$iPrevAssoc]['length'] = (count($this->aOut) -1) - $this->aAssocs[$iPrevAssoc]['offset'];
// post-processing
foreach($this->aFilters as $oFilter)
$oFilter->postProcess();
return $this->get();
}
}
?>
In the category of "other solutions", you could try PHP Parser.
The parser turns PHP source code into an abstract syntax tree....Additionally, you can convert a syntax tree back to PHP code.
If I'm not mistaken http://pear.php.net/package/PHP_Beautifier uses token_get_all() and then rewrites the stream. It uses heaps of methods like t_else and t_close_brace to output each token. Maybe you can hijack this for simplicity.
See our PHP Front End. It is a full PHP parser, automatically building ASTs, and a matching prettyprinter that regenerates compilable PHP code complete with the original commments. (EDIT 12/2011:
See this SO answer for more details on what it takes to prettyprint from ASTs, which are just an organized version of the tokens: https://stackoverflow.com/a/5834775/120163)
The front end is built on top of our DMS Software Reengineering Toolkit, enabling the analysis and transformation of PHP ASTs (and then via the prettyprinter code).
I find print_r in PHP extremely useful, but wonder if there is anything remotely equivalent in Perl?
Note #tchrist recommends Data::Dump over Data::Dumper. I wasn't aware of it, but from the looks of it, seems like it's both far easier to use and producing better looking and easier to interpret results.
Data::Dumper :
A snippet of the examples shown in the above link.
use Data::Dumper;
package Foo;
sub new {bless {'a' => 1, 'b' => sub { return "foo" }}, $_[0]};
package Fuz; # a weird REF-REF-SCALAR object
sub new {bless \($_ = \ 'fu\'z'), $_[0]};
package main;
$foo = Foo->new;
$fuz = Fuz->new;
$boo = [ 1, [], "abcd", \*foo,
{1 => 'a', 023 => 'b', 0x45 => 'c'},
\\"p\q\'r", $foo, $fuz];
########
# simple usage
########
$bar = eval(Dumper($boo));
print($#) if $#;
print Dumper($boo), Dumper($bar); # pretty print (no array indices)
$Data::Dumper::Terse = 1; # don't output names where feasible
$Data::Dumper::Indent = 0; # turn off all pretty print
print Dumper($boo), "\n";
$Data::Dumper::Indent = 1; # mild pretty print
print Dumper($boo);
$Data::Dumper::Indent = 3; # pretty print with array indices
print Dumper($boo);
$Data::Dumper::Useqq = 1; # print strings in double quotes
print Dumper($boo);
As usually with Perl, you might prefer alternative solutions to the venerable Data::Dumper:
Data::Dump::Streamer has a terser output than Data::Dumper, and can also serialize some data better than Data::Dumper,
YAML (or Yaml::Syck, or an other YAML module) generate the data in YAML, which is quite legible.
And of course with the debugger, you can display any variable with the 'x' command. I particularly like the form 'x 2 $complex_structure' where 2 (or any number) tells the debugger to display only 2 levels of nested data.
An alternative to Data::Dumper that does not produce valid Perl code but instead a more skimmable format (same as the x command of the Perl debugger) is Dumpvalue. It also consumes a lot less memory.
As well, there is Data::Dump::Streamer, which is more accurate in various edge and corner cases than Data::Dumper is.
I use Data::Dump, it's output is a bit cleaner than Data::Dumper's (no $VAR1), it provides quick shortcuts and it also tries to DTRT, i.e. it will print to STDERR when called in void context and return the dump string when not.
I went looking for the same thing and found this lovely little Perl function, explicitly meant to generate results like print_r().
The author of the script was asking your exact question in a forum here.
print objectToString($json_data);
Gives this output:
HASH {
time => 1233173875
error => 0
node => HASH {
vid => 1011
moderate => 0
field_datestring => ARRAY {
HASH {
value => August 30, 1979
}
}
field_tagged_persons => ARRAY {
HASH {
nid => undef
}
}
...and so on...