PHP reflection; extracting non-block comments

I've recently become familiar with Reflection and have been experimenting with it, especially getDocComment(). However, it appears to support only /** */ comment blocks.
/** foobar */
class MyClass{}
$refl = new ReflectionClass('MyClass');
// produces /** foobar */
echo $refl->getDocComment();
-Versus-
# foobar
class MyClass{}
$refl = new ReflectionClass('MyClass');
// produces nothing
echo $refl->getDocComment();
Is it not possible to capture this without resorting to any sort of file_get_contents(__FILE__) nonsense?
As per dader51's answer, I suppose my best approach would be something along these lines:
// random comment
#[annotation]
/**
* another comment with a # hash
*/
#[another annotation]
$annotations
= array_filter(token_get_all(file_get_contents(__FILE__)), function(&$token){
return (($token[0] == T_COMMENT) && ($token = strstr($token[1], '#')));
});
print_r($annotations);
Outputs:
Array
(
[4] => #[annotation]
[8] => #[another annotation]
)

DocComments distinguish themselves by saying something about how your classes are to be used, compared to regular comments that could assist a developer in reading the code. That's also why the method isn't called getComment() instead.
Of course it's all text parsing, and someone simply decided that doc comments are always these /** */ blocks. That choice has been made, though, and reading regular comments does not fall under reflection.

I was trying to do just the same a few days ago, and here is my trick. You can use PHP's internal tokenizer ( http://www.php.net/manual/en/function.token-get-all.php ) and then walk the returned array to select only the comments. Here is some sample code:
$a = token_get_all(file_get_contents('path/to/your/file.php'));
print_r($a); // display an array of all tokens found in the file file.php
Here is a list of all the tokens PHP recognizes: http://www.php.net/manual/en/tokens.php
The comments you will get with this method include (from the php.net site):
T_COMMENT : // or #, and /* */ in PHP 5
Hope it helps !
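To make the tokenizer approach above concrete, here is a minimal sketch that keeps only the T_COMMENT tokens. The $source string is a made-up stand-in for file_get_contents('path/to/your/file.php'). One caveat I'm fairly sure of: since PHP 8, a comment starting with #[ is lexed as the start of an attribute rather than as a comment, so the sketch uses a plain # comment.

```php
<?php
// Hypothetical source text standing in for file_get_contents('path/to/your/file.php').
$source = <<<'SRC'
<?php
# foobar
/** doc block */
class MyClass {}
// trailing note
SRC;

$comments = [];
foreach (token_get_all($source) as $token) {
    // Each token is either a one-character string or an [id, text, line] array.
    if (is_array($token) && $token[0] === T_COMMENT) {
        // trim() because older PHP versions keep the trailing newline in the token text.
        $comments[] = trim($token[1]);
    }
}

print_r($comments); // the /** */ block is T_DOC_COMMENT, so it is skipped
```

If you also want the /** */ blocks, check for T_DOC_COMMENT in the same condition.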

AFAIK, for a comment to count as documentation it has to start with /**; a standard multi-line comment (/* */) is not enough.

A doc comment, as the name implies, is a documentation comment, not a standard comment. Otherwise, tools such as doxygen that collect comments would try to document any commented-out code left over from testing or debugging, which is not important to the user of the API.

As you can read here in the first User Contributed Note:
The doc comment (T_DOC_COMMENT) must begin with a /** - that's two asterisks, not one. The comment continues until the first */. A normal multi-line comment /*...*/ (T_COMMENT) does not count as a doc comment.
So only /** */ blocks are given by this method.
I don't know of any other way in PHP to get the other comments than reading the file with file_get_contents and filtering the comments out, e.g. with a regex.
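A quick way to see the /** versus /* difference for yourself; the two class names are made up for illustration:

```php
<?php
/** Documented class. */
class WithDocBlock {}

/* Plain block comment. */
class WithPlainComment {}

$doc = (new ReflectionClass('WithDocBlock'))->getDocComment();
$plain = (new ReflectionClass('WithPlainComment'))->getDocComment();

var_dump($doc);   // the full comment text, delimiters included
var_dump($plain); // false: no doc comment is attached to this class
```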

Related

Official PHPDoc reference for documenting PHP code

I'm in the process of upgrading my projects to PHP 8.0+. Until now, in the code comments, I have used the @param and @return tags as in "option 1" rather than as in "option 2":
Option 1:
@param string[] ....
@return User[] ....
Option 2:
@param array ....
@return array ....
Though, because I don't know if the first form is officially allowed, I begin to ask myself, if it wouldn't be better to switch to the second option... So, I'd like to ask you: Is there any official PHPDoc reference for documenting PHP code available?
Also, is it at all advisable to use the first option instead of the second one? In other words: are there any arguments speaking against it - having the future in mind too?
Thank you for your time.
P.S: I found the reference of PHPDocumentor, but I have the feeling, that it is not the official PHP one and not (yet) compatible with PHP 8.0+.
PHPDoc isn't part of the official PHP documentation, but it has been so widely adopted that I highly doubt it will be ignored.
Prior to version 8, PHP itself defines only comment syntax ( https://www.php.net/manual/en/language.basic-syntax.comments.php ), which does not include any @-related elements.
PHP 8 introduces attributes ( https://www.php.net/manual/en/language.attributes.overview.php ), which will be the native replacement for annotations.
For example https://api-platform.com/docs/core/filters/
PHP till 7.x
/**
* @ApiResource(attributes={"filters"={"offer.date_filter"}})
*/
class Offer
{
// ...
}
Since PHP 8
#[ApiResource(attributes: ['filters' => ['offer.date_filter']])]
class Offer
{
// ...
}
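Unlike docblock annotations, attributes can be read back through reflection without any comment parsing. A minimal sketch, assuming PHP 8.0+; the Filterable attribute class is hypothetical and is not API Platform's ApiResource:

```php
<?php
// Hypothetical attribute class for illustration; requires PHP 8.0+.
#[Attribute(Attribute::TARGET_CLASS)]
final class Filterable
{
    public $filters;

    public function __construct(array $filters = [])
    {
        $this->filters = $filters;
    }
}

#[Filterable(filters: ['offer.date_filter'])]
class Offer
{
    // ...
}

// getAttributes() returns ReflectionAttribute objects; newInstance() builds the attribute.
$attributes = (new ReflectionClass(Offer::class))->getAttributes(Filterable::class);
$filterable = $attributes[0]->newInstance();

print_r($filterable->filters);
```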
PSR Standard
PHP-FIG has drafted two PSRs (not approved yet):
PSR-5 https://github.com/php-fig/fig-standards/blob/master/proposed/phpdoc.md
PSR-19 https://github.com/php-fig/fig-standards/blob/master/proposed/phpdoc-tags.md
Though, because I don't know if the first form is officially allowed,
I begin to ask myself, if it wouldn't be better to switch to the
second option...
I would just stick with Option 1. It is extremely beneficial from a code-completion standpoint.
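For what it's worth, the two options don't exclude each other: the native declaration can only say array, while the docblock carries the element type for IDEs and static analysers. A small sketch with a made-up User class and makeUsers() function:

```php
<?php
// Hypothetical class and function, only to show docblock and native types side by side.
final class User
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

/**
 * @param string[] $names
 * @return User[]
 */
function makeUsers(array $names): array
{
    $users = [];
    foreach ($names as $name) {
        $users[] = new User($name);
    }
    return $users;
}

$users = makeUsers(['ada', 'linus']);
echo count($users), ' ', $users[1]->name;
```

PHP itself enforces only the array part at runtime; string[] and User[] remain documentation.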

PHP "with" keyword - what does "with" do?

Can someone please explain what PHP "with" does?
Example begins:
Say I have a class:
\App\fa_batch
What is the difference between this statement:
$w = (with (new \App\fa_batch))
// uses 'with'
// returns App\fa_batch
and this statement?
$n = (new \App\fa_batch)
// does not use 'with'
// also returns App\fa_batch
Context / Background Info:
I'm having trouble finding documentation for with, perhaps because PHP.net, Stack Overflow and Google all treat php "with" keyword as such a common search phrase.
If context helps, I came across this usage of the word with from this answer:
https://stackoverflow.com/a/33222754/5722034
with is not a keyword, it's a laravel function. The extra space between with and ( is a red herring.
The 5.2 docs include it in miscellaneous helpers. The source is available on github as well
https://laravel.com/docs/5.2/helpers#miscellaneous
with() is a helper function that just returns the object.
A normal use case I've seen is when you're cloning an object, it allows you to chain onto that clone:
$object = new Object();
with(clone $object)->doSomethingWithoutAffectingTheOriginal();
In the use case you've provided, there is no difference. with() is completely redundant if you've wrapped the creation of the instance in parentheses.
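Here is a self-contained version of the clone use case. The with() below is a local stand-in that is only defined if Laravel hasn't already provided one, and Counter is a made-up class:

```php
<?php
// Stand-in for the helper discussed here; skipped if with() already exists.
if (!function_exists('with')) {
    function with($object)
    {
        return $object;
    }
}

// Hypothetical class with a chainable method.
final class Counter
{
    public $count = 0;

    public function increment()
    {
        $this->count++;
        return $this;
    }
}

$original = new Counter();
$copy = with(clone $original)->increment();

// The original is untouched; only the clone was incremented.
echo $original->count, ' ', $copy->count;
```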
Thanks to the answers by various posters, I've realised it is a function in Laravel. Here is the Laravel source: Taken from vendor/laravel/framework/src/Illuminate/Support/helpers.php
if (! function_exists('with')) {
/**
* Return the given object. Useful for chaining.
*
* @param mixed $object
* @return mixed
*/
function with($object)
{
return $object;
}
}
From that terse comment there, I understand it is used for chaining, e.g. queries together. (I use the term 'understand' loosely.)
with is not a PHP keyword.
Also take a look at the language reference page.
tldr
The with function has been used to chain methods, but is no longer required. Since php 5.4 you can simply write (new MyObject)->myMethod() - this was not possible prior to v5.4:
Example: https://3v4l.org/57dc0#v5.3.29
More detailed answer
The with was a workaround to use the instance without assigning it to a variable:
$instance = new A;
$instance->doSomething();
Using the with function (works prior to v5.4 - example):
with(new A)->doSomething();
Since v5.4 you can simply write:
(new A)->doSomething();

Regular Expression how to make regex take second /** as starting point

Here is the $source example
/**
* These functions can be replaced via plugins. If plugins do not redefine these
* functions, then these will be used instead.
*/
if ( !function_exists('wp_set_current_user') ) :
/**
* Changes the current user by ID or name.
*
*/
function wp_set_current_user($id, $name = '') {
Attention: some don't have the function_exists line.
For my special purpose, I'm trying to parse the docblock with regular expression.
Here is the regex
$t = preg_match_all("#(/\*\*.*?\*/\nfunction\s.*?\(.*?\))\s{#mis",$source,$m);
I expect to get:
/**
* Changes the current user by ID or name.
*
*/
function wp_set_current_user($id, $name = '') {
but instead, it returns me the whole code example.
Any help would be appreciated.
Some people have asked about my purpose, though I don't think it matters here.
I'm using Geany, and I found that its existing WordPress code hints are incomplete.
The docblock parsers I found don't parse the function name and function arguments,
so I'm trying to parse them on my own.
The code hint format of Geany is:
wp_set_current_user|Changes the current user by ID or name.|($id, $name = '')|
However, my point of this question is how to make regex take second "/**" as starting point?
I'm sorry for my poor English that confused you all.
You can match a docblock with a regexp like this (check out a regex lookaround tutorial; use the s modifier so that . also matches newlines):
/\*\*(?:(?!\*/).)*\*/
Then any amount of whitespace can occur:
\s*
Which keywords can a function have in PHP? static, abstract, final, public, private, protected; correct me if I'm forgetting something.
(?:(?:static|abstract|final|public|private|protected)\s+)*
Okay, now the function header and parentheses:
function\s+(?P<name>[A-Za-z_]\w*)\s*\(...\)
The ... part gets complicated because it can contain a default value, which can be a complicated PHP string ($remove_characters = '\'"\n\r '), so here is how to parse a value (string, string, number, constant):
"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"
\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*'
[\d.]+
\w+
Resulting to one large value regexp:
("[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*'|[\d.]+|\w+)
And every function argument has a format $var or $var = data (of course any number of spaces + I'm omitting array $input = array()) so this is simplified var name matching:
\\$[\w_][\w\d_]*
Type matching:
([\w_]+\s+)?
So function arguments can be:
\s*([\w_]+\s+)?(\\$[\w_][\w\d_]*|\\$[\w_][\w\d_]*\s*=\s*<value>)
And complete regexp for function would look like:
function\s+(?P<name>[A-Za-z_]\w*)\s*\(\s*(?:<argument>(?:\s*,<argument>)*)?\s*\)
I won't be testing those regexps for you; it's your job to do so at this point. My goal was to show you what you need if you want to do this really correctly (but feel free to edit my answer if you find a mistake). You may also use a really simplified version (like just one regexp for function arguments that eats everything).
If you want the easy, dirty trick, use a lookbehind assertion:
(?<=if\ \(\ !function_exists\('wp_set_current_user'\)\ \)\ :)
Putting this in front of your pattern should do the trick. (You might have to escape the single quotes.)
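To answer the literal question (how to make the match start at the second /**), one option is to stop the comment part from ever crossing a */. Below is a rough sketch using the sample from the question. It assumes the argument list contains no ")" (so defaults like array() would break it), and the variable names are mine:

```php
<?php
$source = <<<'SRC'
/**
 * These functions can be replaced via plugins. If plugins do not redefine these
 * functions, then these will be used instead.
 */
if ( !function_exists('wp_set_current_user') ) :
/**
 * Changes the current user by ID or name.
 *
 */
function wp_set_current_user($id, $name = '') {
SRC;

// (?:(?!\*/).)* consumes one character at a time but never steps over a "*/",
// so a match cannot start in the first docblock and run into the second.
$pattern = '~(/\*\*(?:(?!\*/).)*\*/)\s*function\s+(\w+)\s*\(([^)]*)\)\s*\{~s';

$hint = null;
if (preg_match($pattern, $source, $m)) {
    list(, $docblock, $name, $args) = $m;
    // First docblock line that has text after the leading "*".
    preg_match('~^\s*\*\s*(\S.*)$~m', $docblock, $d);
    $hint = $name . '|' . $d[1] . '|(' . $args . ')|';
}

echo $hint;
```

With preg_match_all and PREG_SET_ORDER the same pattern should give one set of groups per documented function.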

What does @param mean in a class?

What does @param mean when creating a class? As far as I understand, it is used to tell the script what datatype the variables are and what kind of value a function returns. Is that right?
For example:
/**
* @param string $some
* @param array $some2
* @return void
*/
Isn't there another way to do that? I am thinking of things like void function() { ... } or something like that, and for variables maybe (int)$test;
@param doesn't have special meaning in PHP; it's typically used within a comment to write up documentation. The example you've provided shows just that.
If you use a documentation tool, it will scour the code for @param, @summary, and other similar tags (depending on the sophistication of the tool) to automatically generate well-formatted documentation for the code.
As others have mentioned, the @param you are referring to is part of a comment and not syntactically significant. It is simply there to provide hints to a documentation generator. You can find more information on the supported directives and syntax by looking at the PHPDoc project.
Speaking to the second part of your question... As far as I know, there is not a way to specify the return type in PHP. You also can't force the parameters to be of a specific primitive type (string, integer, etc).
You can, however, require that the parameters be either an array, an object, or a specific type of object as of PHP 5.1 or so.
function importArray(array $myArray)
{
}
PHP will throw an error if you try and call this method with anything besides an array.
class MyClass
{
}
function doStuff(MyClass $o)
{
}
If you attempt to call doStuff with an object of any type except MyClass, PHP will again throw an error.
I know this is old but just for reference, the answer to the second part of the question is now, PHP7.
// this function accepts and returns an int
function age(int $age): int{
return 18;
}
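To show that the declaration is enforced while the docblock is not, here is a small sketch (PHP 7+). It passes a non-numeric string, which as far as I know raises a TypeError even without strict_types:

```php
<?php
/**
 * @param int $age  Documentation only; PHP does not read this.
 * @return int
 */
function age(int $age): int
{
    return $age;
}

$result = age(18);

try {
    age('eighteen'); // neither an int nor a numeric string
    $error = null;
} catch (TypeError $e) {
    $error = get_class($e);
}

echo $result, ' ', $error;
```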
PHP is entirely oblivious to comments. @param is used to describe the parameters a method takes, only to make the code more readable and understandable.
Additionally, good coding practice dictates using certain tag names (such as @param) so that documentation generators can pick them up.
Some IDEs will include the @param and other information in the tooltip when hovering over the relevant method.
@param is part of the description comment that tells you what the input parameter type is. It has nothing to do with code syntax. Use an editor with syntax highlighting, like Notepad++, to easily see what's code and what's comments.

Declaring lots of variables for phpdoc without starting each with /**

I have objects with many variables that I declare and explain in the comments. I am commenting very thoroughly for later processing using phpDoc, however I have no experience with actually compiling the documentation yet.
I find it very annoying that with phpDoc notation, each variable eats up four to six lines of code even if the only attribute I want to set is the description:
/**
* @desc this is the description
*/
var $variable = null;
I would like to use the following notation:
# @desc this is the description
var $variable = null;
Is there a simple way to tweak phpDoc into accepting this, or will it give me trouble when I actually try to compile documentation out of it? I don't need the tweak now (although it's appreciated of course), just a statement from somebody who knows phpDoc whether this is feasible without having to re-engineer large parts of its code.
Just write one-line docblocks
/** @desc this is the description */
var $variable = null;
Problem solved.
In addition to what Frank Farmer mentioned (+1 to his solution),
/** is tokenized as T_DOC_COMMENT by the PHP tokenizer since PHP 5. This means that documentation comments are always parsed from /** to */.
You can't just use # or /* to write your PHP documentation.
See:
http://www.php.net/manual/en/tokens.php
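You can check this from PHP itself, without running phpDoc. A minimal sketch with a made-up Settings class (using public instead of var): only the one-line /** */ form ends up attached to the property.

```php
<?php
// Hypothetical class for illustration.
class Settings
{
    /** @desc this is the description */
    public $variable = null;

    # @desc a hash comment is not a docblock
    public $other = null;
}

$doc = (new ReflectionProperty('Settings', 'variable'))->getDocComment();
$none = (new ReflectionProperty('Settings', 'other'))->getDocComment();

var_dump($doc);  // the one-line docblock
var_dump($none); // false
```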
