Understanding large php code, what techniques to use? - php

I have been handed a large, undocumented codebase for an application written in PHP, as the original coder went AWOL. My task is to add new features, but I can't do that without understanding the code. I started poking around and, honestly, I am overwhelmed by the amount of source code. I have found:
It's well written, based on an MVC architecture with DB persistence, templating and OOP
It's modular, and there is a concept of URL-based routing and basic templating
It uses a custom-written PHP framework which has no documentation, and there is no source control history (oops!)
There are over 500 files, each containing hundreds of lines of code. Every file has 3-4 require_once statements which include tons of other files, so it's hard to tell which function/class/method is coming from where
Now I am looking for some techniques that I can use to understand this code. For example, consider the following code snippet:
class SiteController extends Common {
    private $shared;
    private $view;

    protected function init(){
        $this->loadShared();
        $this->loadView();
    }

    private function loadShared(){
        $this->shared = new Home();
    }

    private function loadView(){
        $this->view = new HomeView();
    }
}
I want to know
Where are HomeView() & Home() defined? Where do $this->shared & $this->view come from? I checked the rest of the file; there is no method named shared or view. So obviously they are coming from one of the hundreds of classes being included using require_once(). But which one? How can I find out?
Can I get a list of all the functions or methods that are being executed? If yes, then how?
This class SiteController overrides a base Common class, but I am unable to find out where this Common class is located. How can I tell?
Further, please share some techniques that can be used to understand existing code written in PHP.

First, in this kind of situation, I try to get an overview of the application : some kind of global idea of :
What the application (not the code !) does
How the code is globally organized : where are the models, the templates, the controllers, ...
How each type of component is structured -- once you know how a Model class works, others will typically work the same way.
Once you have that global idea, a possibility to start understanding how the code works, if you have some time before you, is to use a PHP Debugger.
About that, Xdebug + Eclipse PDT is a possibility -- but pretty much all modern IDEs support that.
It'll allow you to go through the generation of a page step by step, line by line, understanding what is called, when, from where, ...
Of course, you will not do that for the whole application !
But as your application uses a Framework, there are high chances that all parts of the application work kind of the same way -- which means that really understanding one component should help understanding the other more easily.
As a couple of tools to understand what calls what and how and where, you might want to take a look at :
The inclued extension (quoting) : Allows you trace through and dump the hierarchy of file inclusions and class inheritance at runtime
Xdebug + KCacheGrind will allow you to generate call-graphs ; XHProf should do the same kind of thing.
Using your IDE (Eclipse PDT, Zend Studio, phpStorm, netbeans, ...), ctrl+click on a class/method should bring you to its declaration.
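If no IDE or extension is at hand, the "where is this class declared?" question can also be answered at runtime with PHP's built-in Reflection API. The following is only a minimal sketch: describeClass() is a hypothetical helper name, and the two empty classes are stand-ins for the ones in the question so that the snippet runs on its own.

```php
<?php
// Hypothetical helper (not part of the original application): report where a
// class is declared and what it extends, using the built-in Reflection API.
function describeClass(string $className): array
{
    $ref = new ReflectionClass($className);
    $parent = $ref->getParentClass(); // false when the class has no parent

    return [
        'class'  => $ref->getName(),
        'file'   => $ref->getFileName(),  // false for built-in classes
        'line'   => $ref->getStartLine(), // false for built-in classes
        'parent' => $parent ? $parent->getName() : null,
    ];
}

// Stand-ins for the classes in the question, so the sketch is self-contained.
class Common {}
class SiteController extends Common {}

print_r(describeClass('SiteController'));
```

In the real application you would drop a call like this into init() for a moment, after all the require_once statements have run, and read the 'file' entry for Common, Home or HomeView.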
Also note that an application is not only code: I often find it very useful to reverse-engineer the database and generate a diagram of all tables.
If you are lucky, there are foreign keys in your database -- and you'll have links between tables, this way ; which will help you understand how they relate to each other.

You need an IDE. I use NetBeans for PHP and it works great. This will allow you to find out where the HomeView/Home classes are by right-clicking and selecting a "find where defined" option or something similar.
You can get a list. This is called the stack. Setting up a debugger like xdebug with the IDE will allow you to do this.
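Even before a debugger is set up, PHP itself can show the stack at any point through debug_backtrace(). Below is a rough sketch; currentCallChain(), the Home class body and handleRequest() are made-up names for illustration, not code from the application.

```php
<?php
// Hypothetical helper: return the chain of calls that led to the current point
// as readable strings, innermost call first.
function currentCallChain(): array
{
    $chain = [];
    foreach (debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS) as $frame) {
        // Method calls carry 'class' and 'type' ("->" or "::"); plain functions do not.
        $chain[] = isset($frame['class'])
            ? $frame['class'] . $frame['type'] . $frame['function']
            : $frame['function'];
    }
    array_shift($chain); // drop the frame for currentCallChain() itself

    return $chain;
}

// Made-up call path, standing in for a real request.
class Home
{
    public function load(): array
    {
        return currentCallChain();
    }
}

function handleRequest(): array
{
    return (new Home())->load();
}

print_r(handleRequest());
```

The list starts with Home->load followed by handleRequest; anything after that depends on how the script itself was started (for example through an include).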

grep is the only thing that lets me survive such code.

Look inside of the script where you found this code snippet for additional included or required pages that PHP imported into the main script. Those scripts should define those classes that are being instantiated.
Sorry, not sure if you can find which functions/methods have been executed. I know you can find if they exist, and you can find the generated output of them... but not sure if they have been executed.
It is important to note that SiteController doesn't override the Common class; it extends it, or builds on top of it, like how a building is built on a foundation. The Common class is the foundation. Again, check the included and required scripts to see where Common was defined.
Hope that helps,
spryno724

I would start with:
throwing exception at certain points to see a stacktrace where the call originated.
grep for Class Common for example
create a directory listing to get a feeling for the organization of the software
use get_included_files(); to see what is actually used for a certain call
Start documenting what I find out
Start working with an IDE, like NetBeans, Eclipse or Zend Studio
Figuring out class hierarchies with maybe this "php: determining class hierarchy of an object at runtime" approach
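Two of the points above (get_included_files() and the runtime class hierarchy) fit in a few lines. This is only a sketch under assumptions: classHierarchy() is a hypothetical helper, and the empty classes stand in for the real ones.

```php
<?php
// Hypothetical helper: an object's class followed by all of its ancestors,
// nearest parent first.
function classHierarchy(object $object): array
{
    return array_merge([get_class($object)], array_values(class_parents($object)));
}

// Stand-ins for the application's classes.
class Common {}
class SiteController extends Common {}

print_r(classHierarchy(new SiteController()));

// Every file PHP has loaded so far for this request, including this one.
print_r(get_included_files());
```

Placed at the end of a request (or inside a controller method), the second call shows which of the 500+ files were actually pulled in for that one page.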

You seem to realize that you can't read/digest every file, so you've got to focus on the important ones. Looks like you've started that process with SiteController.
Hopefully between reading the requires and using your IDE you can chase down the Home() and HomeView()
There might be a few key XML files that dictate the mappings from URLs to controller files, so you'll want to figure out how they work also.
I've worked with a poorly documented (but decently working) custom framework before, and your situation seems pretty similar. I found things pretty smooth once I understood the main controller and basically formed an understanding for how URL requests were processed.

1) You can use a search tool such as grep to find code, including definitions. But on a big code base, grep is slow, and it gives a lot of false positives because it has no understanding of the PHP language.
Our Search Engine is a GUI-based tool that indexes your source code to achieve extremely fast lookup, indexing by the language elements (variable names, constants, keywords, strings, ..) and allowing you to formulate queries that honor the language structure (e.g., it ignores whitespace and comments unless you say you want to see them). A query shows hits in a hit window, and a click takes you to the file/line in which the hit occurs. With some tiny bit of additional configuration, you can go from the code window into your favorite editor.
2) Sometimes you want to know where specific functionality exists, but you have no clue what to search for. Here a test coverage tool can really help. Simply set up test coverage for the (working) application, and exercise the functionality manually; what is "covered" is potentially the code you care about. Exercise something which is NOT the feature; what is covered is NOT the code you want. This is way easier than trying to run a debugger to find the code of interest. Our PHP Test Coverage tool can provide you this coverage, and not only show you the covered code in a GUI, but also do that "coverage subtraction" so that you can see just the relevant code.
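The "coverage subtraction" idea itself is simple enough to sketch by hand. The snippet below is not the tool described above; it assumes two coverage arrays of the shape [file => [line => 1]], which is modelled on what Xdebug's xdebug_get_code_coverage() reports for executed lines, and it uses made-up sample data. subtractCoverage() is a hypothetical helper name.

```php
<?php
// Keep only the lines that ran while exercising the feature but did NOT run
// during the baseline request. Both arguments are assumed to have the shape
// [file => [lineNumber => 1, ...]].
function subtractCoverage(array $feature, array $baseline): array
{
    $result = [];
    foreach ($feature as $file => $lines) {
        $onlyInFeature = array_diff_key($lines, $baseline[$file] ?? []);
        if ($onlyInFeature !== []) {
            $result[$file] = array_keys($onlyInFeature);
        }
    }

    return $result;
}

// Made-up sample data for illustration only.
$feature  = ['Home.php' => [10 => 1, 11 => 1, 42 => 1], 'Common.php' => [5 => 1]];
$baseline = ['Home.php' => [10 => 1, 11 => 1], 'Common.php' => [5 => 1]];

print_r(subtractCoverage($feature, $baseline));
```

With this sample data only line 42 of Home.php survives the subtraction, which is the kind of short list you would then go and read.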

Start from the entry point of the application (usually index.php) and go deeper on what gets called when.
Give PhpStorm a go; it's an IDE with excellent code-analysis features: it can go to the definition of any class or variable, show the inheritance hierarchy, find usages and do many other useful things.
I'll also plug my own tool:
http://raveren.github.io/kint/
It works with zero setup and is extremely useful for getting a grip on what's going on where. Use Kint::trace(); to see a pretty execution backtrace and d(get_defined_vars()); to see what is defined in the current context, and eventually you'll get there.

Related

Code Hinting custom functions/objects/constants, and on chaining, commentary in Adobe Dreamweaver CS5

In Dreamweaver CS5 there's something called Code Hinting (let's call it CH for short).
CH has a bunch of information about functions, constants and objects built in the core library.
When you press CTRL+SPACEBAR or begin structuring a statement starting with $,
a window with lots of information pops up, giving me the information about it without having to look it up myself. If I press ENTER while the CH is up and something is selected, it will automatically fill in the rest for me.
I love this feature, I really do. Reminds me a little of Intellisense.
It saves me lots of time.
The issues I face, and haven't found any solutions to, are straightforward.
Issue #1 Chained methods do not display a code hint
Since PHP implemented the Classes and Objects, I've been able to chain my methods within classes/objects. Chaining is actually easy, by returning $this (the instance of that class), you can have a continuous chain of calls
class Object_Factory{
public function foo(){
echo "foo";
return $this;
}
public function bar(){
echo "bar";
return $this;
}
}
$objf = new Object_Factory;
//chaining
$objf->foo()
->bar();
Calling them separately shows the CH.
$objf->foo();
$objf->bar();
The problem is that after the first method has been called and I try to chain another method, there's no CH to display the next call's information.
So, here's my first question:
Is there a way, in Dreamweaver CS5, to make the code hints appear on chaining?
Plugins, some settings I haven't found, anything?
if("no") "Could you explain why?";
Issue #2 Code hinting for custom functions, objects and constants
As shown in the first picture, there's a lot of information popping up. In fact, there's a document just like it on the online library. Constants usually have a very small piece of information, such as a number.
In this image, MYSQL_BOTH represents 3.
Here's my second question:
Is it possible to get some information to the CH window for custom functions, objects and constants?
For example, with Intellisense you can use a setup with HTML tags and three slashes ///
///<summary>
///This is test function
///</summary>
public void TestFunction(){
//Do something...
}
Can something similar be done here?
Changing some settings, a plugin, anything?
Update
I thought I'd found something that might be the answer to at least issue #1, but it costs money, and I'm not going to pay for anything until I know it actually does what I want.
Has anyone tried it, or know it won't solve any of the issues?
The search continues...
In case none of these are possible to fix, here's hoping one of the developers notices this question and implements it in an update/new release.
I just switched to NetBeans after 10 years of using Dreamweaver. My impressions may help you. (I'll call them NB and DW respectively from now on)
Code Hints / Documentation
PHP built-in functions
Both DW and NB show all of the built-in PHP functions and constants. A nice feature is that they also provide with a link that opens the related PHP documentation page.
DW is much slower to update the definitions (through sporadic Adobe updates or on the next release) and updating them doesn't look easy (on the other hand, I quickly found the .zip files that NB uses for the PHP/HTML/CSS reference, in case I wanted to manually edit/update them).
However, since documentation can be opened so easily, I do not consider this to be a problem.
Custom functions/classes
This is where NB is clearly better; it instantly learns from your project's code. Hints for function parameters are smart in many cases, suggesting the most likely variable first.
Method chaining works wonderfully. (This would address question #1.)
PHPDoc Support
I was greatly impressed with this feature. Take for example the above screenshot. I just typed /** followed by Enter and NB automatically completed the comment with the return type hint (also function parameters if present).
<?php
class Object_Factory {
    /**
     *
     * @return \Object_Factory
     */
    public function foo(){
        echo "foo";
        return $this;
    }
}
?>
Another example:
(This would address question #2)
You can include HTML code as well as some special @ tags in your PHPDoc comments to add external links, references, examples, etc.
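To make that concrete, here is a small sketch of the chaining class from the question with PHPDoc comments that use a few common tags (@param, @return, @see, @link). The $label parameter is a hypothetical addition for illustration, and how much of this a given IDE displays will vary.

```php
<?php
class Object_Factory
{
    /**
     * Print a label and return the factory so that calls can be chained.
     *
     * @param string $label Text to print (hypothetical parameter, added for illustration).
     * @return Object_Factory The same instance, so an IDE can hint the next call in the chain.
     * @see Object_Factory::bar()
     * @link https://www.php.net/manual/en/language.oop5.php PHP manual: Classes and Objects
     */
    public function foo(string $label = 'foo'): Object_Factory
    {
        echo $label;
        return $this;
    }

    /**
     * Print "bar" and return the factory.
     *
     * @return Object_Factory
     */
    public function bar(): Object_Factory
    {
        echo 'bar';
        return $this;
    }
}

(new Object_Factory())->foo()->bar();
```

The @return tag is the piece that matters for chaining: it tells the editor what type comes back, so it knows which methods to offer after the arrow.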
Debugging tools
Also noteworthy IMHO are the debugging tools included with NB. You can trace all variables (also superglobals!) while you advance step-by-step.
Configuring xDebug is very easy, just uncomment some lines in your php.ini and that's it!
Other stuff
The refactoring (i.e. renaming or safely deleting functions/variables) in NB is really nice. It gives you a very graphically detailed preview of the changes before committing them.
However, the search/replace functions of DW are vastly better. I miss a lot the "search for specific tag with attribute..." function. NB only provides a RegEx search/replace.
NB has a nice color chooser but it almost never suggests it; I thought for a while there wasn't one until I accidentally discovered it. Now I know how to invoke it (CTRL+SPACE, start typing Color chooser and Enter). Very cumbersome, indeed.
I haven't used FTP a lot since I moved to NB, but I have the feeling that DW was also much better, specially for syncing local/remote folders.
NB has really good native support for SVN, Mercurial and Git. When you activate versioning support, you can see every change next to the line number (the green part on my screenshots means those lines are new). I can click on a block and compare/revert those changes, see who originally committed every line (and when), etc.
Even when [team] versioning is deactivated, NB has a built-in local history that helps you recover previous versions as well as deleted files.
Conclusion
Starting with Macromedia Dreamweaver and seeing how it slowly stayed behind the Internet as Adobe struggled to integrate and adapt their products is a painful process. (To this day DW still doesn't render correctly, even with LiveView. To be fair, NB doesn't have a built-in renderer)
Certainly, the Adobe-ization of DW has had its advantages, but this humble programmer was having a hard time justifying a $399 USD ~400MB IDE vs a very comparable free 49MB multi-platform IDE.
After the initial learning curve, I'm very comfortable with NetBeans and I don't think I'll be returning to Dreamweaver any time soon.
I know this doesn't directly answer your questions regarding DW, but I hope it helps anyway.
Use the Site-Specific Code Hinting feature
Make your own structure: just add the files where your functions, classes, etc. are stored. Save the structure and you're done. It just worked for me!
I know it is an older question and this is not the complete answer. But it will help someone for sure.
http://tv.adobe.com/watch/learn-dreamweaver-cs5/sitespecific-code-hinting-in-dreamweaver-cs5/
"Use Dreamweaver CS5 to view code hints related to content management
system frameworks such as WordPress, Drupal, and Joomla. Learn how to
set up site-specific code hinting for a CMS so you can easily work
with your PHP website in Dreamweaver. "
For #1, the complication with a scripting language is that it's not strictly typed. The function/method could return null, false, true, int, array, string...
So the 'intellisense' has no type to base a hint on unless it recompiles the code and checks every possible return type.
For #2, the hinting is based on a clip definition file that exists for each version of PHP. With Microsoft products, the current project's (compiled) definitions are added. With PHP there is no compiling, checking or addition to the clip database (automatically). Some editors, like PSPad, will give you a CodeExplorer that lists each function and class in that file, but the only means I know of to get them to show up in hinting is to add them to the clip definitions. I don't know where, or if, that's possible in Dreamweaver. Zend Studio and others do custom compiling and inclusion.

Aptana Studio 3 - Can I set color for a specific function in a theme?

I am developing an application using PHP in Aptana Studio 3, and have set up a global debug() function that uses firePhp. As it is, my calls to the debug() function are scattered throughout the code. This is ok for this phase, because it is helping me a lot to catch bugs early. However, all these debug() calls scattered throughout, are making code a lot less readable.
I would like to be able to have only those debug() calls syntax-colored differently from the rest of the functions, preferably a lighter color, so that at first sight they look more like comments than actual code.
I am really confused by the TextMate approach Aptana 3 has taken and, even though I understand that it offers many possibilities, not knowing any Ruby has made this whole configuration thing very unapproachable to me.
In short, is there a quick way to have just this one function colored differently?
Follow up: Because there is officially no way to color-code specific functions, I resorted to renaming all debug() functions to _d(), as well as added _g(), _u(), and _t(), for start group, close group, and stack trace functionality respectively. Adding the underscores changes the "visual texture" of the code, so I can more easily focus on the lines that actually matter.
Short answer: no.
Scopes are assigned by our partitioning and tokenizing in Java code for the various languages, not in separate language syntax definitions like in Textmate.
We follow Textmate conventions for scope names and matching, but we don't currently let you alter how the scopes get assigned via rubles or anything. As a result there's no way for you to assign custom scopes to your own methods/functions/variables.

PHP - Cleanup the Junk

I have inherited a very messy project. There are at least 3 versions that I can tell in it.
Is there a utility that can trace the PHP code from the main index.php so that I can figure out what isn't being used and what is, or am I stuck doing a manual cleanup?
Thanks
*Update*
I don't think I've been clear about what I'm looking for, that or I'm not understanding how the products mentioned work. What I'm looking for is something that can run on a folder (directory) and step through the project and give me a report of which files are actually referenced or used (in the case of images, CSS, etc).
This project has several thousand files and it's a very small project. I'm trying to clean it up and when I do a "search in files" in my IDE I get 3 or 4 references and can't easily tell which one is the right one.
Hope that makes it a little clearer.
Cross referencing software really lets you explore which functions are used for what.
PHPXref is quite good..
For example Yoast used it to cross reference the Wordpress PHP code. Take a look at the Wordpress example of how powerful it is.
For example, start by browsing the WP trunk. Click on some of the file names on the left and observe how the required files are listed, along with defined classes and methods, etc., etc.
There are several utilities that can do this; what first comes to mind is Zend Studio's built-in Optimizer, which will run through your files and issue notices on a per-file basis, including unused variables, warnings, etc. Alternatively, you can run your program in E_STRICT and PHP will notify you of some of your issues.
Be very careful of such cleanup tools, especially in PHP or Javascript. They work reasonably well in languages like Java, but any language that allows Eval() can trip an automated tool up, sometimes in devilishly clever ways, depending on how clever the original code developer thought they were.
You need the inclued extension. You can generate include graphs using GraphViz, see below for example code.
There are some useful examples on PHP.net: http://www.php.net/manual/en/inclued.examples-implementation.php
You might want to check xdebug's code coverage, possibly as an auto_append. However, it's rather limited and it would require you to have either 100% test-cases (which I doubt as you say the project is a mess), or the tenacity to go through every possible action on the site, and even then you'll have to apply good judgement whether you can remove a portion of code because it isn't used, or leave it there because a certain condition just hasn't been met yet in your cases. On a side note: stepping through the code with xdebug's remote debugger has really helped me in the past to quickly get the different mechanisms & flows in unknown projects.
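A coarser, file-level variant of the same idea needs no extension at all: log get_included_files() on every request and compare the log against the project's file list. This swaps Xdebug's line coverage for PHP's built-in include list, so it only tells you about whole files. The sketch below is hedged accordingly: neverIncluded() is a hypothetical helper, the log path is an assumption, and the file names are made-up sample data.

```php
<?php
// Hypothetical helper: given every PHP file in the project and the files seen
// as included across the logged requests, return the ones never included.
function neverIncluded(array $projectFiles, array $includedFiles): array
{
    return array_values(array_diff($projectFiles, array_unique($includedFiles)));
}

// In the real application, a script configured through the auto_append_file ini
// setting could append the include list to a log on every request, e.g.:
//   file_put_contents('/tmp/included.log',
//       implode("\n", get_included_files()) . "\n", FILE_APPEND);
// The path /tmp/included.log is an assumption for illustration.

// Made-up sample data.
$projectFiles  = ['index.php', 'lib/db.php', 'old/index_v1.php'];
$includedFiles = ['index.php', 'lib/db.php', 'index.php'];

print_r(neverIncluded($projectFiles, $includedFiles));
```

The same caveat applies as with coverage: a file missing from the log may simply belong to a code path nobody exercised yet, so treat the result as a list of candidates, not a delete list.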
I would try opening the whole project in NetBeans PHP; it's a great tool which we use for huge projects. You can easily see warnings and notifications and also follow usage of functions/classes easily. Try it!
I would recommend against automatic cleanups and the like. Even if the code seems to work afterwards, I wouldn't sleep very well at night...

PHP (A few questions) OO, refactoring, eclipse

I am using PHP in eclipse. It works ok, I can connect to my remote site, there is colour coding of code elements and some code hints.
I realise this may be too long to answer all questions, if you have a good answer for one part, answering just that is ok.
Firstly General Coding
I have found that it is easy to lose track of included files and their variables. For example, if there was a database $cursor, it is difficult to remember or even know that it was declared in the included file (this becomes much worse the more files you include). How are people dealing with this?
How are people documenting their code, in particular the required GET and POST data?
Secondly OO Development:
Should I be going full OO in my development? Currently I have a functions library which I can include, and I have separated each "task" into a separate file. It is a bit nasty but it works.
If I go OO, how do I structure the directories in PHP? Java uses packages; what about PHP?
How should I name my files? Should I use all lower case with _ for spaces ("hello_world.php")? Should I name classes with uppercase like Java ("HelloWorld.php")? Is there a different naming convention for classes and regular function files?
Thirdly Refactoring
I must say this is a real pain. If I change the name of a variable in one place, I have to go through the whole document and each file that included this file and change the name there too. Of course, errors everywhere is what results. How are people dealing with this problem? In Java, if you change the name in one place it changes everywhere.
Are there any plugins to improve PHP refactoring? I am using the official PHP version of Eclipse from their website.
thanks
Firstly General Coding
1) OO can help you with that. As you encapsulate variables and functionality, they don't go out and mess with the namespaces. Assuming I understand correctly what problem you are hinting at, using an OO approach helps alleviate conflicts that can arise when you are inadvertently redeclaring variables. (Note: Alleviate. Not completely prevent on its own. ;))
Otherwise, a practice I have encountered is prepending variable names with something like a 'package name' -- which merely shifts the problem one level up and isn't exactly beautiful either. :|
2) "However suits their purpose". PHPdoc is a good start; will help to create API documentation.
Secondly OO Development:
3) As said before -- "it depends". Do it when needed. You don't have to go full OO for "hello world". But you can. Weigh the costs and benefits of either route and choose wisely. Though I personally want to suggest when in doubt favour OOP over 'unstructured' approaches. Basically, know your tools and when to use them -- then you can make that call on your own easily. :)
4) As far as I can see, the directories "are structured like packages". Mind you, "directories" and "like". Having said that, various frameworks have solved that problem for themselves; cf. the other answers.
5) Again, however you please. There is not a definitive way You Have To Do It Or Else. Just stick to it once you chose your path ;3
Aside from that, certain frameworks etc. have their own naming conventions. Symfony, e.g., uses CamelCase like Java.
Thirdly Refactoring
I must say this is a real pain.
yes :3 But it pays off.
If I change the name of a variable in one place I have to go through whole document and each file that included this file and change the name there too. Of course, errors everywhere is what results. How are people dealing with this problem? In Java if you change the name in one place it changes everywhere.
No, it doesn't. If you get yourself a tool with support you only have to use the refactoring tool once; but if you rename a class property in java, there is no magic bot that walks through the internet and automagically makes sure everyone on the planet uses the new name. ;)
But as for how to prevent it -- be smart. Honour program contracts, i.e. use interfaces. Do not use functions / members you shouldn't use directly. Watch the hierarchies. Use a reasonable division of code and respect this division's boundaries.
But how people deal with that problem? Well, search and replace I suppose ;)
As for the Eclipse-Plugin -- The dynamic nature of PHP makes it more difficult to automagically refactor code; we can't always use static type hinting etc., and divination of argument and return types is impossible more often than not. So, to the extent of my knowledge, 'automatic refactoring' is not as well-supported by tools as in the Java world. Though I am sure for the doable cases, there should be plugins. :)
I've found using a PHP framework (e.g. Zend, Cake, CodeIgniter, etc) can force class structures and naming conventions while generally addressing autoloading as well. Using PHPDoc formatting liberally helps with code-completion and hinting as well as documenting specific requirements (e.g. method parameter definitions).
For the OO Development part:
I am using the autoload functionality to load the classes dynamically. My directory structure is like packages in java. My classes are named like in java (e.g. HelloWorld.php). But the class is defined with the complete path to that class (e.g. class FW_package1_package2_HelloWorld {...}).
If a class is called, the autoload method replaces every _ with / and searches for the class at the resulting path (e.g. FW/package1/package2/HelloWorld.php).
I am strongly influenced by Java, so that I chose this way.
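The underscore-to-directory mapping described above can be sketched with spl_autoload_register(). This is not the answerer's actual code; classToPath() is a hypothetical helper, and using __DIR__ as the base directory is an assumption for illustration.

```php
<?php
// Map a class name such as FW_package1_package2_HelloWorld to a relative path
// by replacing every underscore with a directory separator.
function classToPath(string $className): string
{
    return str_replace('_', '/', $className) . '.php';
}

// Register the autoloader. The base directory (__DIR__) is an assumption.
spl_autoload_register(function (string $className): void {
    $file = __DIR__ . '/' . classToPath($className);
    if (is_file($file)) {
        require_once $file;
    }
});

echo classToPath('FW_package1_package2_HelloWorld'), "\n";
```

With this in place, the first use of new FW_package1_package2_HelloWorld() makes PHP look for FW/package1/package2/HelloWorld.php under the base directory, so no explicit require_once is needed for that class.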
Take a look at nWire for PHP. It is a plugin for Eclipse PDT which provides code exploration and visualization.
It can easily be used to trace dependencies within your application and it is very handy for OO projects, enabling you to visualize class hierarchies and much more.
It doesn't support refactoring, but it can assist by showing you the references of a given component (e.g. a function or a field).

parse/collator for php

I'm pretty much a newbie at php (at the "install an app and try to tweak it a bit" stage).
Is there a tool anywhere which can take a script which is spread over many files and show you all the code which is processed (for a given set of arguments passed to the script) in a single output?
For example, I want to make a call to zen cart from a script in a different language, which returns a category listing without any surrounding page. So I want to be able to trace what the actual process is to generate that then strip off all the unwanted bits to create a custom script.
One thing I've found very helpful when looking at new / complicated codebases is to use an IDE with some sort of code intelligence. I use php eclipse, and what it does is allow you to jump into function and variable definitions either by means of hyperlinking, or popups. This can be incredibly helpful for navigating through sprawling projects because you don't have to go through all the trouble to search by hand.
In your case, with php, the best thing to do is find the entry point for a page that pulls in your list of categories. Once you find that, you can use eclipse to expand out the various function calls it makes. Being a beginner, it's very helpful to read through code in this manner, as it exposes you to lots of different ways of doing things. An additional bonus of using something like eclipse is that it provides integration with the PHP manual. So anytime you encounter a function you don't know, you can hover over, see the manual, and also how it would be used in context.
What you want is called a "backwards slice" ("all the code that contributes to a specific computed result") in the computing theory literature. To compute the backward slice, something needs to parse the language, compute all the influences (control and dataflow) on a selected point in the program, and then display those points to you.
Slicing tools exist for languages like C. They may exist for Java (as academic versions). I don't know of any that exist for PHP.
Another way to discover the code involved in an action is to run a test coverage tool. Such a tool marks all the code (across many files) that gets executed for a specific action (usually a "unit test" but test coverage tools really don't care). Then you simply exercise the action you care about, and look at the test coverage data. A graphical display will make it easy to see what code was executed; the part you want is buried in all the executed code.
A PHP Test Coverage tool does exist and will provide nice displays of the covered code.
If you are looking for a debugger of some sort, have a look at XDebug or ZendDebugger.
