Own syntax highlighter - php

I am about writing own simple syntax highlighter in PHP. I've done basic algorithm based on regular expressions and string replacement, but what I really don't know is way how to disable replacing keywords which are commented.
For example:
/**
* Some class
*
* #property-read $foo
*/
class Test
{
private $foo;
public function __construct()
{
}
}
Now my solution simply highlight defined keywords (like class or variables) but also those which are commented.
Any solution for this problem?

Why not use PHP's tokenizer to do the job for you? That way, your syntax highlighter will parse the PHP code the exact same way the Zend Engine does, which is probably going to give you a lot better results than a regular expression.

Why not borrow lessons from how vi or vim already does this? long back I remember for some custom tag based language we developed, we wanted syntax highlighting in VI and VIM , that is when we changed few .vi sort of configuration files where we mentioned, all the meta data like which color to what kind of tag, what are tags possible etc.
Looking more into how vi or vim or any text editor does this might be more helpful!

You could exclude the commented lines by this logic:
if line starts with /** disable highlight
if next line starts with * do nothing and check next line
if line starts with */ reenable highlight
Just a quick guess and can be defined more precise, but should work as a logic.

Related

PHP - Best way to get meta data from comments

PHP has a function called get_meta_tags which can read meta tags of HTML files. However, as far as I know there is no standard way to define meta tags for PHP files. The de facto solution seems to be to add comment to the top of the file like so:
<?php
# Author: Ood
# Description: Hello World
?>
Is there any way to read these "meta tags" with PHP similar to the way get_meta_tags works using the default PHP library? Preferably without parsing the entire file with file_get_contents followed by a regex for best performance. If not, maybe someone knows of a better solution to add meta data capabilities to PHP files. Thanks in advance!
In our project we are happy with the standard JavaDoc that was adopted by PHPDoc using the #field syntax as you might know it from any PHP function or class definition. This is pretty fine readable using the PHPDocumenter.
In our adoption we use the very first multi-line comment, ie anyting between /** and closing tag */, using the JavaDoc style to describe all the details about the current script.
So to adopt your example in our project we would have following syntax:
<?php
/**
* #author Ood
* #desc Hello World
*/
Of course you may end up with your custom function reading the beginning of the php file parsing just the very first multi-line comment to get the script description aka meta tags.

Vim add '*' character for docs on new line [duplicate]

This question already has an answer here:
How can I get Vim to not break DocBlock comments?
(1 answer)
Closed 9 years ago.
I don't even know how to search for this on google, I tried with the title of this question, but could not find anything useful.
Working with PHP on Vim, if I have something like this:
/**
* <cursor>
*/
If I press enter, I'll get:
/**
*
<cursor>
*/
What I want is this:
/**
*
* <cursor>
*/
Yes, it is just one character, but it is bugging me a little bit.
What is the easiest way to achieve this?
Edit:
My .vimrc file have these commands (and others):
" Enable syntax highlight
syntax on
" Syntax in a plugin-based way
filetype on
" Indentation in a plugin-based way
filetype indent plugin on
set fo+=or
I guess, what you need below lines in your vimrc:
filetype on
filetype indent plugin on
I'm using the configuration of spf13-vim and it works for me. I think that the plugin PHP Integration environment for Vim - PIV makes the trick because have a feature for documentation
conforming documentation blocks for your PHP code.
Well, I'm sorry for this. I found the problem:
autocmd FileType php,phtml :set ft=php.html
I used this line to load HTML snippets for PHP files with the snipMate.vim plugin. Turns out, if I remove this line, the behaviour will be what I expected to be.
Thank you everyone.

VIM: Show PHP function / class in command line?

Is there any way to show the current PHP function or class name in the VIM command line? I found a plugin for showing C function names in the status line but it does not work for PHP and in any case I prefer the command line be used to save valuable vertical lines.
Thanks.
EDIT
While looking for something completely unrelated in TagList's help I've just found these two functions:
Tlist_Get_Tagname_By_Line()
Tlist_Get_Tag_Prototype_By_Line()
Adding this in my statusbar works beautifully:
%{Tlist_Get_Tagname_By_Line()}
Also, did you read the Vim Wiki? It has a bunch of tips trying to adress the same need. There is also this (untested) plugin.
ENDEDIT
If you are short on vertical space maybe you won't mind using a bit of horizontal space?
TagList and TagBar both show a vertical list of the tags used in the current buffer (and other opened documents in TagList's case) that you can use to navigate your code.
However, I'm not particularly a fan of having all sorts of informations (list of files, VCS status, list of tags, list of buffers/tabs…) displayed at all times: being able to read the name of the function you are in is only useful when you actually need to know it, otherwise it's clutter. Vim's own [{ followed by <C-o> are enough for me.
I don't know anything about PHP, and I'm not trying to step on anyone's toes, but having looked at some PHP code I came up with this function which I think takes a simpler approach than the plugins that have been mentioned.
My assumpmtion is that PHP functions are declared using the syntax function MyFunction(){} and classes declared using class MyClass{} (possibly preceded by public). The following function searches backwards from the cursor position to find the most recently declared class or function (and sets startline). Then we search forward for the first {, and find the matching }, setting endline. If the starting cursor line is inbetween startline and endline, we return the startline text. Otherwise we return an empty string.
function! PHP_Cursor_Position()
let pos = getpos(".")
let curline = pos[1]
let win = winsaveview()
let decl = ""
let startline = search('^\s*\(public\)\=\s*\(function\|class\)\s*\w\+','cbW')
call search('{','cW')
sil exe "normal %"
let endline = line(".")
if curline >= startline && curline <= endline
let decl = getline(startline)
endif
call cursor(pos)
call winrestview(win)
return decl
endfunction
set statusline=%{PHP_Cursor_Position()}
Because it returns nothing when it is outside a function/class, it does not display erroneous code on the statusline, as the suggested plugin does.
Of course, I may well be oversimplifying the problem, in which case ignore me, but this seems like a sensible approach.

Can I use a hash sign (#) for commenting in PHP?

I have never, ever, seen a PHP file using hashes (#) for commenting. But today I realized that I actually can! I'm assuming there's a reason why everybody uses // instead though, so here I am.
Is there any reason, aside from personal preference, to use // rather than # for comments?
2021 UPDATE: As of PHP 8, the two characters are not the same. The sequence #[ is used for Attributes.(Thanks to i336 for the comment)
Original Answer:
The answer to the question Is there any difference between using "#" and "//" for single-line comments in PHP? is no.
There is no difference. By looking at the parsing part of PHP source code, both "#" and "//" are handled by the same code and therefore have the exact same behavior.
PHP's documentation describes the different possibilities of comments. See http://www.php.net/manual/en/language.basic-syntax.comments.php
But it does not say anything about differences between "//" and "#". So there should not be a technical difference. PHP uses C syntax, so I think that is the reason why most of the programmers are using the C-style comments '//'.
<?php
echo 'This is a test'; // This is a one-line C++ style comment
/* This is a multi-line comment.
Yet another line of comment. */
echo 'This is yet another test.';
echo 'One Final Test'; # This is a one-line shell-style comment
?>
RTM
Is there any reason, aside from personal preference, to use // rather than # for comments?
I think it is just a personal preference only. There is no difference between // and #. I personally use # for one-line comment, // for commenting out code and /** */ for block comment.
<?php
# This is a one-line comment
echo 'This is a test';
// echo 'This is yet another test'; // commenting code
/**
* This is a block comment
* with multi-lines
*/
echo 'One final test';
?>
One might think that the # form of commenting is primarily intended to make a shell script using the familiar "shebang" (#!) notation. In the following script, PHP should ignore the first line because it is also a comment. Example:
#!/usr/bin/php
<?php
echo "Hello PHP\n";
If you store it in an executable file you can then run it from a terminal like this
./hello
The output is
Hello PHP
However, this reasoning is incorrect, as the following counterexample shows:
#!/usr/bin/php
#A
<?php
#B
echo "Hello PHP\n";
The first line (the shebang line) is specially ignored by the interpreter. The comment line before the PHP tag is echoed to standard output because it is not inside a PHP tag. The comment after the opening PHP tag is interpreted as PHP code but it is ignored because it is a comment.
The output of the revised version is
#A
Hello PHP
If you establish some rule sets in your team / project... the 2 types of comments can be used to outline the purpose of the commented code.
For example I like to use # to mute / disable config settings, sub functions and in general a piece of code that is useful or important, but is just currently disabled.
There's no official PSR for that.
However, in all PSR example code, they use // for inline comments.
There's an PSR-2 extension proposal that aims to standardize it, but it's not official: https://github.com/php-fig-rectified/fig-rectified-standards/blob/master/PSR-2-R-coding-style-guide-additions.md#commenting-code
// is more commonly used in the PHP culture, but it's fine to use # too. I personally like it, for being shorter and saving bytes. It's personal taste and biased, there's no right answer for it, until, of course, it becomes a standard, which is something we should try to follow as much as possible.
Yes, however there are cross platform differences.
I use # all the time for commenting in PHP, but I have noticed an adoption difference.
On windows keyboard the # key is easy to use.
On mac keyboard # key mostly isn't present.
So for mac users, [Alt] + [3] or [⌥] + [3] is more difficult to type than //, so // has become a cross platform way of displaying code with comments.
This is my observation.
From https://php.net/manual/en/migration53.deprecated.php
"Deprecated features in PHP 5.3.x ...Comments starting with '#' are now deprecated in .INI files."
There you have it. Hash '#' appears to remain as a comment option by default by not being deprecated. I plan to use it to distinguish various layers of nested if/else statements and mark their closing brackets, or use to distinguish code comments from commented out code as others have suggested in related posts. (Note: Link was valid/working as of 4/23/19, although who knows if it'll still be working when you're reading this.)
Is there any reason, aside from personal preference, to use // rather
than # for comments?
I came here for the answer myself, and its good to know there is NO code difference.
However, preference-wise one could argue that you'd prefer the 'shell->perl->php' comment consistency vs the 'c->php' way.
Since I did approach php as a poor man's webby perl, I was using #.. and then I saw someone else's code and came straight to SO. ;)
OP Question: "Is there any reason, aside from personal preference, to use // rather than # for comments?"
One 2021 Answer, which is certainly not the only answer as we see in this thread:
If you're using Visual Studio Code and using regions to block your code, then you must use # rather than // to define the region. To the question, No, even for this use case : If you are commenting out a region, you can use # or // or /** */, the technique you use for this is personal preference.
Examples for block definition in VSCode :
#region this is a major block
/** DocBlock */
function one() {}
/** DocBlock */
function two() {
#region nested region based on indentation
// comments and code in here
# another nested region based on indentation
// foo
#endregion
#endregion
}
#endregion
On Fold of the inner block:
#region this is a major block
/** DocBlock */
function one() {}
/** DocBlock */
function two() {
> #region nested region based on indentation
}
#endregion
On Fold of the outer block:
> #region this is a major block
I cite the following specific usage which one might be tempted to try, but these do not work. In fact this is exactly how you DISable a #region block:
// #region
// #endregion
/** #region */
/** #endregion */
As to commenting out a region in VSCode:
/** You can now collapse this block
#region Test1
// foo
#endregion
// everything through to here is collapsed
*/
// #region Test1
// folding is disabled here
// #endregion
# #region Test1
// this also disables the fold
# #endregion
All of that said, "Is there any reason, aside from personal preference, to use // rather than # for comments?" I agree with comments in this thread and in the other thread: // is more commonly recognized and used, which is usually a good reason to use that comment style over #.
Final note, be careful about nesting based on indentation, as code formatting can remove your manual indentation and thus ruin your scheme of nested blocks based on comments. I've tested this with both # and // (which BTW, // nests on indentation too. Again, in context with the OP question, No, there is no reason to use // over # for nested indentation in this context in the current VSCode because both work exactly the same. However, this is a use case for using # over //.
Ref - no extension required, verified in 1.62.3. See notes on indentation there as well.
Comments with "#" are deprecated with PHP 5.3. So always use // or /.../

Parse function names from PHP source using PHP regex?

I'm using ctags on linux to create tags for source code using vim and the Tlist plug-in. The current ctags function parsing for PHP is woeful so i've downloaded the source for ctags and im going to change the regex that parses the functions.
Because i'm dealing with lots of code that has functions declared in many different ways i need a regex to reliably parse the function names properly.
Do you have one you could share that parses a php function name from a line of source code?
This is the current patched and 'improved' one from the ctags source which misses many functions, especially those marked as final or static first.
(^[ \t]*)(public[ \t]+|protected[ \t]+|private[ \t]+)?(static[ \t]+)?function[ \t]+&?[ \t]*([" ALPHA "_][" ALNUM "_]*)
Would just adding static and final to the possible list of words to ignore, and making it match more then one of the keywords be close enough?
(^[ \t]*)((public|protected|private|static|final)[ \t]*)*function[ \t]+&?[ \t]*([" ALPHA "_][" ALNUM "_]*)
Would mean it would accept junk like 'public public static final function bogus()', but php's syntax checking will reject it, and therefore shouldn't be a significant issue.
s/^.*\sfunction\s([^\(]*)\(.*$/\1/i
i tested it with some sed
grep -Ri function *| head -10 | sed 's/^.*\sfunction\s\([^\(]*\)(.*$/\1/i'

Categories