I'm in a situation where I have the class "progress" - trouble is that I'm interested in incorporating some open source that also has the class "progress". I remember very vaguely some reference to an app that's able to inspect the various files in a development for a given class name ... and either return it ... or change it. Have you experience of this?
thanks
Some languages (C++, C#) have a notion of namespaces. Java has packages. Python has modules. They are designed to prevent class name collisions. Assuming you're talking about Java, the open source class called "Progress" is probably in its own package, and your "Progress" is in yours. So you may not have the problem you think you have.
Having said that, I'm guessing you're asking about the IDE feature called refactoring, which lets you change a class name everywhere that refers to or depends on it.
Many IDEs (e.g. Eclipse) can do this, but it's more common for Java than PHP. Because of the dynamic nature of PHP variables, it's a little harder for IDEs to know exactly what to change. For example, you could do this:
$name = "prog";
$name .= "ress";
$cls = new $name();
The IDE would have to actually run your app to know that $name contains "progress" at that particular moment. Even then there would be no way to automatically fix it.
So sadly, you're stuck with find-and-replace. Don't worry though; changing class names during development is a good thing. It forces you to re-think exactly what the class is for.
Always prefix class names with the product/project abbreviation. Why not use TextPad to replace all occurrences? That way you won't be confused in future which class it is. I would also recommend using namespace/package technique but that would make you rewrite a lot of code I assume.
http://en.wikipedia.org/wiki/Code_refactoring
The above link lists some tools used for refactoring.
I designed a PHP 5.5+ framework comprised of more than 750 different classes to make both web applications and websites.
I would like to, ideally, be able to reduce its size by producing a version of itself containing just the bare essential files and resources needed for a given project (whether it's a website or a web application).
What I want to do is to be able to:
reduce the amount of traits, classes, constants and functions to the bare essential per project
compress the code files to achieve a smaller deployment size and faster execution (if possible)
So far, I've got the second part completed. But the most important part is the first, and that's where I'm having problems. I have a function making use of get_declared_classes() and get_declared_traits(), get_defined_constants() and get_defined_functions() to get the full list of user-defined classes, traits, functions and constants. But it gives me pretty much EVERYTHING and that's not what I want.
Is there a way to get all defined classes, functions and constants (no need for traits as I could run class_uses() on every class and get the list of traits in use by that class) for a single given script?
I know there's the token_get_all() function, but I tried it with no luck (or maybe I'm using it the wrong way).
Any hint? Any help would be greatly appreciated :)
You can use PHP Parser for this. It constructs abstract syntax trees based on the files you supply to it. Then you can analyze its output for each file, and produce a report usable to you.
Other than that, you can use the token_get_all() approach you've mentioned already, and write a small parser yourself. Depending on your project, this might be easier or more difficult. For example, do you use a lot of new X() constructs, or do you tend to pass dependencies via constructors?
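As a rough sketch of the token_get_all() route, assuming PHP 7.1+ and that you only need declared names rather than a full parse: the helper below (list_declarations is a name made up here, not a PHP built-in) walks the token stream and records the identifier that follows each class, interface, trait or function keyword. It counts methods as functions and ignores namespaces and functions that return by reference, so treat it as a starting point rather than a replacement for PHP Parser.

```php
<?php
// Sketch only: list_declarations() is a hypothetical helper, written for PHP 7.1+.
function list_declarations(string $source): array
{
    $kinds = [
        T_CLASS     => 'class',
        T_INTERFACE => 'interface',
        T_TRAIT     => 'trait',
        T_FUNCTION  => 'function', // includes methods
    ];
    $found  = ['class' => [], 'interface' => [], 'trait' => [], 'function' => []];
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $prev   = null; // id (or character) of the last non-whitespace token

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if (!is_array($token)) { // single-character token such as "{" or ";"
            $prev = $token;
            continue;
        }
        $id = $token[0];
        if ($id === T_WHITESPACE) {
            continue;
        }
        // Skip "Foo::class" and "use function foo;", which are not declarations.
        if (isset($kinds[$id]) && $prev !== T_DOUBLE_COLON && $prev !== T_USE) {
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            // A declaration is followed by a plain name; closures and
            // anonymous classes are followed by "(" or "{" instead.
            if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $found[$kinds[$id]][] = $tokens[$j][1];
            }
        }
        $prev = $id;
    }
    return $found;
}
```

Running it over file_get_contents() of each script gives a per-file list that you can compare against what a given project actually references.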
Unfortunately, these are about the only viable choices you have, since PHP is a dynamically typed language.
If you use dependency injection, however, you might want to take a look at your DI framework's internal cache files, which often contain such dependency maps. If you don't use such a framework, I recommend starting to, especially since your project is big and that's where dependency injection excels. PHP-DI, one such framework, proved successful in some of my middle-size projects (25k SLOC).
Who knows? Maybe refactoring your project to use DI will let you accomplish the task you want without even getting to know all the dependencies. One thing I'm sure of is that it will help you maintain it.
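To make the new X() versus constructor question concrete, here is a minimal sketch with invented names (Mailer, ArrayMailer, SignupService). No DI framework is involved, and this is not how PHP-DI itself is configured; it only shows why constructor injection makes a class's dependencies visible to a reader or a static tool.

```php
<?php
// All names here are hypothetical, chosen for illustration.
interface Mailer
{
    public function send(string $to, string $body): bool;
}

// A stand-in implementation that records messages instead of sending them.
class ArrayMailer implements Mailer
{
    public $sent = [];

    public function send(string $to, string $body): bool
    {
        $this->sent[] = [$to, $body];
        return true;
    }
}

class SignupService
{
    private $mailer;

    // The dependency is declared in the signature, so it can be found by
    // reading (or parsing) the constructor rather than every method body.
    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function signUp(string $email): bool
    {
        return $this->mailer->send($email, 'Welcome!');
    }
}

$mailer  = new ArrayMailer();
$service = new SignupService($mailer);
$service->signUp('user@example.com');
```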
I have been handed a large, undocumented PHP application because the original coder went AWOL. My task is to add new features, but I can't do that without understanding the code. I started poking around and, honestly, I am overwhelmed by the amount of source code. I have found:
It's well written, based on MVC architecture, DB persistence, templating and OOP
It's modular; there is a concept of URL-based routing and basic templating
It uses a custom-written PHP framework which has no documentation. And there is no source control history (oops!)
There are over 500 files, each containing hundreds of lines of code. Every file has 3-4 require_once statements which include tons of other files, so it's hard to tell which function/class/method is coming from where
Now I am looking for some techniques that I can use to understand this code. For example, consider the following code snippet:
class SiteController extends Common {
    private $shared;
    private $view;

    protected function init(){
        $this->loadShared();
        $this->loadView();
    }

    private function loadShared(){
        $this->shared = new Home();
    }

    private function loadView(){
        $this->view = new HomeView();
    }
}
I want to know
Where are HomeView() & Home() defined? Where do $this->shared & $this->view come from? I checked the rest of the file, and there is no method named shared or view, so obviously they are coming from one of the hundreds of classes being included using require_once(). But which one? How can I find out?
Can I get a list of all the functions or methods that are being executed? If yes, then how?
This class SiteController overrides a base Common class, but I am unable to find out where this Common class is located. How can I tell?
Further, please share some techniques that can be used to understand existing code written in PHP.
First, in this kind of situation, I try to get an overview of the application : some kind of global idea of :
What the application (not the code !) does
How the code is globally organized : where are the models, the templates, the controllers, ...
How each type of component is structured -- once you know how a Model class works, others will typically work the same way.
Once you have that global idea, a possibility to start understanding how the code works, if you have some time before you, is to use a PHP Debugger.
About that, Xdebug + Eclipse PDT is a possibility -- but pretty much all modern IDEs support that.
It'll allow you to go through the generation of a page step by step, line by line, understanding what is called, when, from where, ...
Of course, you will not do that for the whole application !
But as your application uses a Framework, there are high chances that all parts of the application work kind of the same way -- which means that really understanding one component should help understanding the other more easily.
As a couple of tools to understand what calls what and how and where, you might want to take a look at :
The inclued extension (quoting) : Allows you trace through and dump the hierarchy of file inclusions and class inheritance at runtime
Xdebug + KCacheGrind will allow you to generate call-graphs ; XHProf should do the same kind of thing.
Using your IDE (Eclipse PDT, Zend Studio, phpStorm, netbeans, ...), ctrl+click on a class/method should bring you to its declaration.
Also note that an application is not only code: I often find it very useful to reverse-engineer the database and generate a diagram of all tables.
If you are lucky, there are foreign keys in your database -- and you'll have links between tables, this way ; which will help you understand how they relate to each other.
You need an IDE. I use netbeans for PHP and it works great. This will allow you to find out where the homeview/home classes are by right clicking and selecting a "find where defined" option or something similar.
You can get a list. This is called the stack. Setting up a debugger like xdebug with the IDE will allow you to do this.
grep is the only thing that lets me survive such code
Look inside of the script where you found this code snippet for additional included or required pages that PHP imported into the main script. Those scripts should define those classes that are being instantiated.
Sorry, not sure if you can find which functions/methods have been executed. I know you can find if they exist, and you can find the generated output of them... but not sure if they have been executed.
It is important to note that SiteController doesn't override the Common class; it extends it, or builds on top of it, like how a building is built on a foundation. The Common class is the foundation. Again, check the included and required scripts to see where Common was defined.
Hope that helps,
spryno724
I would start with:
throwing exception at certain points to see a stacktrace where the call originated.
grep for Class Common for example
create a directory listing to get a feeling for the organization of the software
use get_included_files(); to see what is actually used for a certain call
Start documenting what I find out
Start working with an IDE, like NetBeans, Eclipse or Zend Studio
Figuring out class hierarchies with maybe this "php: determining class hierarchy of an object at runtime" approach
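A few of these steps can be done with built-in functions alone. The sketch below uses stand-in Common and SiteController classes (the real ones would come from the application) and three helper names invented here: where_is_defined(), ancestry() and current_trace().

```php
<?php
// Stand-ins for the classes in the question; in the real application
// they would already be loaded by the require_once chain.
class Common {}
class SiteController extends Common {}

// Report the file and line where a loaded class is declared.
function where_is_defined(string $class): string
{
    $ref = new ReflectionClass($class);
    return $ref->getFileName() . ':' . $ref->getStartLine();
}

// Parent classes, nearest first.
function ancestry(string $class): array
{
    return array_values(class_parents($class));
}

// A stack trace without stopping the script: create an exception but do not throw it.
function current_trace(): string
{
    return (new Exception())->getTraceAsString();
}

echo where_is_defined('SiteController'), "\n";
print_r(ancestry('SiteController'));
print_r(get_included_files()); // every file pulled in so far
echo current_trace(), "\n";
```

Dropping the last four lines into a controller method, with the class name swapped for Home or HomeView, should answer the "which file is this from" question for that request.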
You seem to realize that you can't read/digest every file, so you've got to focus on the important ones. Looks like you've started that process with SiteController.
Hopefully between reading the requires and using your IDE you can chase down the Home() and HomeView()
There might be a few key XML files that dictate the mappings from URLs to controller files, so you'll want to figure out how they work also.
I've worked with a poorly documented (but decently working) custom framework before, and your situation seems pretty similar. I found things pretty smooth once I understood the main controller and basically formed an understanding for how URL requests were processed.
1) You can use a search tool such as grep to find code, including definitions. But on a big code base, grep is slow, and it gives a lot of false positives because it has no understanding of the PHP language.
Our Search Engine is a GUI-based tool that indexes your source code to achieve extremely fast lookup, indexing by the language elements (variable names, constants, keywords, strings, ..) and allowing you to formulate queries that honor the language structure (e.g., it ignores whitespace and comments unless you say you want to see them). A query shows hits in a hit window, and a click takes you to the file/line in which the hit occurs. With some tiny bit of additional configuration, you can go from the code window into your favorite editor.
2) Sometimes you want to know where specific functionality exists, but you have no clue what to search for. Here a test coverage tool can really help. Simply set up test coverage for the (working) application, and exercise the functionality manually; what is "covered" is potentially the code you care about. Exercise something which is NOT the feature; what is covered is NOT the code you want. This is way easier than trying to run a debugger to find the code of interest. Our PHP Test Coverage tool can provide you this coverage, and not only show you the covered code in a GUI, but also do that "coverage subtraction" so that you can see just the relevant code.
Start from the entry point of the application (usually index.php) and go deeper on what gets called when.
Give PhpStorm a go; it's an IDE with excellent code analysis features: it can go to the definition of any class or variable, show the inheritance hierarchy, find usages and much more.
I'll also plug my own tool:
http://raveren.github.io/kint/
It works with zero setup and is extremely useful for getting a grip on what's going on where. Use Kint::trace(); to see a pretty execution backtrace and d(get_defined_vars()); to see what is defined in the current context, and eventually you'll get there.
I am using PHP in eclipse. It works ok, I can connect to my remote site, there is colour coding of code elements and some code hints.
I realise this may be too long to answer all questions, if you have a good answer for one part, answering just that is ok.
Firstly General Coding
I have found that it is easy to lose track of included files and their variables. For example, if there was a database $cursor, it is difficult to remember or even know that it was declared in the included file (this becomes much worse the more files you include). How are people dealing with this?
How are people documenting their code - in particular the required GET and POST data?
Secondly OO Development:
Should I be going full OO in my development? Currently I have a functions library which I can include, and I have separated each "task" into a separate file. It is a bit nasty but it works.
If I go OO, how do I structure the directories in PHP? Java uses packages - what about PHP?
How should I name my files? Should I use all lower case with _ for spaces, like "hello_world.php"? Should I name classes with uppercase like Java, "HelloWorld.php"? Is there a different naming convention for classes and regular function files?
Thirdly Refactoring
I must say this is a real pain. If I change the name of a variable in one place, I have to go through the whole document and each file that includes this file and change the name there too. Of course, errors everywhere is what results. How are people dealing with this problem? In Java, if you change the name in one place it changes everywhere.
Are there any plugins to improve PHP refactoring? I am using the official PHP version of Eclipse from their website.
thanks
Firstly General Coding
1) OO can help you with that. As you encapsulate variables and functionality, they don't leak out and clutter the global namespace. Assuming I understand correctly which problem you are hinting at, an OO approach helps alleviate the conflicts that can arise when you inadvertently redeclare variables. (Note: Alleviate. Not completely prevent on its own. ;))
Otherwise, a practice I have encountered is prefixing variable names with something like a 'package name' -- which merely shifts the problem one level up and isn't exactly beautiful either. :|
2) "However suits their purpose". PHPdoc is a good start; will help to create API documentation.
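As one possible shape for documenting required GET and POST data with a PHPDoc block, here is a small sketch. The form, its field names and the function read_login_request() are all invented for illustration; the point is only that the docblock states which request fields are expected and the function enforces the same contract.

```php
<?php
/**
 * Read the fields of a (hypothetical) login form.
 *
 * Expected POST data:
 *   - username  string, required, non-empty
 *   - remember  "1" when the box is ticked, optional
 *
 * @param array $post Request data, normally $_POST.
 * @return array Keys: 'username' (string), 'remember' (bool).
 * @throws InvalidArgumentException When username is missing or empty.
 */
function read_login_request(array $post): array
{
    $username = isset($post['username']) ? trim((string) $post['username']) : '';
    if ($username === '') {
        throw new InvalidArgumentException('username is required');
    }
    return [
        'username' => $username,
        'remember' => isset($post['remember']) && $post['remember'] === '1',
    ];
}
```

Passing the array in, rather than reading $_POST inside the function, also keeps the function easy to test.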
Secondly OO Development:
3) As said before -- "it depends". Do it when needed. You don't have to go full OO for "hello world". But you can. Weigh the costs and benefits of either route and choose wisely. Though I personally want to suggest when in doubt favour OOP over 'unstructured' approaches. Basically, know your tools and when to use them -- then you can make that call on your own easily. :)
4) As far as I can see, the directories "are structured like packages". Mind you, "directories" and "like". Having said that, various frameworks have solved that problem for themselves; cf. the other answers.
5) Again, however you please. There is not a definitive way You Have To Do It Or Else. Just stick to it once you chose your path ;3
Aside of that certain frameworks etc. have their own naming conventions. Symfony, e.g., uses CamelCase like Java.
Thirdly Refactoring
I must say this is a real pain.
yes :3 But it pays off.
If I change the name of a variable in one place I have to go through whole document and each file that included this file and change the name there too. Of course, errors everywhere is what results. How are people dealing with this problem? In Java if you change the name in one place it changes everywhere.
No, it doesn't. If you get yourself a tool with support you only have to use the refactoring tool once; but if you rename a class property in java, there is no magic bot that walks through the internet and automagically makes sure everyone on the planet uses the new name. ;)
But as for how to prevent it -- be smart. Honour program contracts, i.e. use interfaces. Do not use functions / members you shouldn't use directly. Watch the hierarchies. Use a reasonable division of code and respect this division's boundaries.
But how people deal with that problem? Well, search and replace I suppose ;)
As for the Eclipse-Plugin -- The dynamic nature of PHP makes it more difficult to automagically refactor code; we can't always use static type hinting etc., and divination of argument and return types is impossible more often than not. So, to the extent of my knowledge, 'automatic refactoring' is not as well-supported by tools as in the Java world. Though I am sure for the doable cases, there should be plugins. :)
I've found using a PHP framework (e.g. Zend, Cake, CodeIgniter, etc) can force class structures and naming conventions while generally addressing autoloading as well. Using PHPDoc formatting liberally helps with code-completion and hinting as well as documenting specific requirements (e.g. method parameter definitions).
For the OO Development part:
I am using the autoload functionality to load the classes dynamically. My directory structure is like packages in java. My classes are named like in java (e.g. HelloWorld.php). But the class is defined with the complete path to that class (e.g. class FW_package1_package2_HelloWorld {...}).
When a class is used, the autoload method replaces every _ with / and looks for the class file at the resulting path (e.g. FW/package1/package2/HelloWorld.php).
I am strongly influenced by Java, so that I chose this way.
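The underscore-to-directory scheme described above could be sketched like this. The answer doesn't say which autoload mechanism it uses, so spl_autoload_register() is an assumption here, as are the helper names class_to_path() and register_underscore_autoloader() and the choice of __DIR__ as the class root.

```php
<?php
// Hypothetical helper: map FW_package1_package2_HelloWorld
// to FW/package1/package2/HelloWorld.php.
function class_to_path(string $class): string
{
    return str_replace('_', '/', $class) . '.php';
}

// Register the autoloader; $baseDir is an assumed root directory for all classes.
function register_underscore_autoloader(string $baseDir): void
{
    spl_autoload_register(function (string $class) use ($baseDir): void {
        $file = rtrim($baseDir, '/') . '/' . class_to_path($class);
        if (is_file($file)) {
            require $file;
        }
    });
}

register_underscore_autoloader(__DIR__);
```

After this, new FW_package1_package2_HelloWorld() triggers a single require of the matching file the first time the class is used.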
Take a look at nWire for PHP. It is a plugin for Eclipse PDT which provides code exploration and visualization.
It can easily be used to trace dependencies within your application and it is very handy for OO projects, enabling you to visualize class hierarchies and much more.
It doesn't support refactoring, but it can assist by showing you the references to a given component (e.g. a function or a field).
I've never been able to figure this out. If your language doesn't type-check, what benefits do interfaces provide you?
Interfaces cause your program to fail earlier and more predictably when a subclass "forgets" to implement some abstract method in its parent class.
In PHP's traditional OOP, you had to rely on something like the following to issue a run-time error:
class Base_interface {
function implement_me() { assert(false); }
}
class Child extends Base_interface {
}
With an interface, you get immediate feedback when one of your interface's subclasses doesn't implement such a method, at the time the subclass is declared rather than later during its use.
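For comparison, here is the same idea written with an interface. The name Implementable is invented for this sketch; Child and implement_me() follow the snippet above.

```php
<?php
// Hypothetical names; the interface replaces the assert(false) base class.
interface Implementable
{
    public function implement_me();
}

class Child implements Implementable
{
    public function implement_me()
    {
        return 'done';
    }
}

// Removing implement_me() from Child would stop the script with a fatal
// error as soon as the class is declared, before any object is created.
$child = new Child();
echo $child->implement_me(), "\n";
```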
Taken from this link (sums it up nicely):
Interfaces allow you to define/create a common structure for your classes – to set a standard for objects.
Interfaces solves the problem of single inheritance – they allow you to inject ‘qualities’ from multiple sources.
Interfaces provide a flexible base/root structure that you don’t get with classes.
Interfaces are great when you have multiple coders working on a project; you can set up a loose structure for programmers to follow and let them worry about the details.
I personally find interfaces a neat solution when building a DataAccess layer which has to support multiple DBMSs. Each DBMS implementation must implement the global DataAccess interface with functions like Query, FetchAssoc, FetchRow, NumRows, TransactionStart, TransactionCommit, TransactionRollback etc. So when you expand your data-access possibilities you are forced to use a generically defined function schema, and your application won't break at some point because you decided the function Query should now be named execQuery.
Interfacing helps you develop in the bigger picture :)
Types serve three distinct functions:
design
documentation
actual type checking
The first two don't require any form of type checking at all. So, even if PHP did no checking of interfaces, they would still be useful just for those two reasons.
I, for example, always think about my interfaces when I'm doing Ruby, despite the fact that Ruby doesn't have interfaces. And I often wish I could have some way of recording those design decisions in the source code.
On the other hand, I have seen plenty of Java code that used interfaces, but clearly the author never thought about them. In fact, in one case, one could see from the indentation, whitespace and some leftover comments in the interface that the author had actually just copied and pasted the class definition and deleted all method bodies.
Now to the third point: PHP actually does type check interfaces. Just because it type checks them at runtime doesn't mean it doesn't type check them at all.
And, in fact, it doesn't even check them at runtime, it checks them at load time, which happens before runtime. And isn't "type checking doesn't happen at runtime but before that" pretty much the very definition of static type checking?
You get errors if you haven't added the required methods with the exact same signature.
Interfaces are often used with unit testing (test-driven design).
They also give you more stable code.
Interfaces are also used to support iterators (e.g. support for foreach on objects) and comparators.
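The foreach point can be illustrated with PHP's built-in IteratorAggregate and Countable interfaces. Playlist is a made-up class for this sketch, and the return type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical collection class; IteratorAggregate and Countable are built-in interfaces.
class Playlist implements IteratorAggregate, Countable
{
    private $tracks;

    public function __construct(array $tracks)
    {
        $this->tracks = $tracks;
    }

    // Called by foreach to obtain something it can loop over.
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->tracks);
    }

    // Called by count($playlist).
    public function count(): int
    {
        return count($this->tracks);
    }
}

$playlist = new Playlist(['Intro', 'Verse', 'Outro']);
$seen = [];
foreach ($playlist as $track) {
    $seen[] = $track;
}
```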
It may be weakly typed, but there is type hinting for methods: function myFunc(MyInterface $interface)
Also, interfaces do help with testing and decoupling code.
Type hinting in function/method signatures gives you much more control over the way a class interfaces with its environment.
If you just hope that a user of your class will only pass the correct objects as method parameters, you'll probably run into trouble. To prevent this, you'd have to implement complicated checks and filters that would bloat your code and would definitely lower its performance.
Type hinting gives you a tool to ensure compatibility without any bloated, hand written checks. It also allows your classes to tell the world what they can do and where they'll fit in.
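A small sketch of that, with invented names (Logger, MemoryLogger, run_job). It assumes PHP 7.1 or later, where a mismatched argument raises a TypeError that can be caught; older versions report a catchable fatal error instead.

```php
<?php
// Hypothetical names for illustration.
interface Logger
{
    public function log(string $message): void;
}

class MemoryLogger implements Logger
{
    public $lines = [];

    public function log(string $message): void
    {
        $this->lines[] = $message;
    }
}

// The type hint is the only check needed: anything that is not a Logger is rejected.
function run_job(Logger $logger): void
{
    $logger->log('job finished');
}

$logger = new MemoryLogger();
run_job($logger);

$rejected = false;
try {
    run_job(new stdClass()); // not a Logger
} catch (TypeError $e) {
    $rejected = true;
}
```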
Especially in complex frameworks like the Zend Framework, interfaces make your life much easier because they tell you what to expect from a class and because you know what methods to implement to be compatible with something.
In my opinion, there's no point, no need and no sense. Things like interfaces, visibility modifiers or type hints are designed to enforce program "correctness" (in some sense) without actually running it. Since this is not possible in a dynamic language like PHP, these constructs are essentially useless. The only reason they were added to PHP is to make it look more like Java, thus making the language more attractive for the "enterprise" market.
Forgot to add: uncommented downvoting sucks. ;//
Good design dictates only writing each function once. In PHP I'm doing this by using include files (like Utils.php and Authenticate.php), with the PHP command include_once. However I haven't been able to find any standards or best practices for PHP include files. What would you at StackOverflow suggest?
I'm looking for:
Naming Standards
Code Standards
Design Patterns
Suggestions for defining return types of common functions (now I'm just using associative arrays).
One convention I like to use is to put each class in its own file named ClassName.class.php and then set up the autoloader to include the class files. Or sometimes I'll put them all in a classes/ subdirectory and just name them ClassName.php. Depends on how many class vs. non-class includes I'm expecting.
If you organize your utility functions into classes and make them static methods instead, you can get away with writing only a single require_once() in your top level files. This approach may or may not be appropriate for your code or coding style.
As for return types, I try to follow the conventions used in the built-in functions. Return a type appropriate to the request, or return false on failure. Just make sure you use the === operator when checking for false in the results.
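The reason for insisting on === is easiest to see with strpos(), which can legitimately return 0. The find_user() helper below is invented here to show a user-defined function following the same return-false-on-failure convention.

```php
<?php
// strpos() returns the match position, or false when there is no match.
$position = strpos('progress', 'pro'); // match at position 0

$looseCheck  = ($position == false);  // true: 0 is loosely equal to false
$strictCheck = ($position === false); // false: 0 is an int, not a bool

// Hypothetical helper following the same convention:
// returns the matching key on success, false on failure.
function find_user(array $users, string $name)
{
    return array_search($name, $users, true);
}

$found = find_user(['alice', 'bob'], 'alice'); // key 0, which is a valid result
if ($found !== false) {
    echo "found at key $found\n";
}
```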
The fact that you're concerned about conventions suggests you're already on the right track. If you are familiar with any other OOP language like Java, C++, C#, etc., then you'll find you can follow a lot of the same conventions thanks to the OOP goodness in PHP5.
Whatever naming convention you end up using (I prefer to take cues from either Java or C# wherever possible), make sure that any include files you use for functions do not actually execute code when included, and never include the same file twice (use include_once or require_once).
Some such standards have been written already. Most large projects will follow a standard of their own.
Here is one written by Zend and is the standard used in the Zend framework.
http://framework.zend.com/manual/en/coding-standard.html
Also, PEAR always had some fairly strict coding standards:
http://pear.php.net/manual/en/standards.php
My preferred answer though is that for your own project you should use what you feel comfortable with, and be internally consistent. For other projects, follow their rules. The consistency allows for greatest code readability. My own standards are not the same as the PEAR ones. I do not indent with four spaces (I use tabs) and I never use camel case like function names, but nonetheless if I am editing something from another project I'll go with whatever that project does.
I've done the following. First, I created an intercepting filter, to intercept all web requests, I also created a version which would work with command line commands.
Both interceptors would go to a bootstrap file, which would set up an autoloader. This file has the autoloading function and a hash. For the hash, the key is the class name and the value is the file path to the class file. The autoload function simply takes the class name and runs a require on the file.
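A minimal sketch of such a class-map autoloader, assuming spl_autoload_register() as the mechanism. The helper name register_class_map() and the paths in the example map are placeholders, not taken from the setup described above.

```php
<?php
// Hypothetical bootstrap helper: $map is class name => file path.
function register_class_map(array $map): void
{
    spl_autoload_register(function (string $class) use ($map): void {
        if (isset($map[$class])) {
            require $map[$class]; // plain require: PHP autoloads a class only once
        }
    });
}

// Example map; the paths are placeholders.
register_class_map([
    'Utils'        => __DIR__ . '/lib/Utils.php',
    'Authenticate' => __DIR__ . '/lib/Authenticate.php',
]);
```

Classes missing from the map are simply left to any other registered autoloader, so the map can be introduced gradually.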
A few performance tips, if you need them: use single quotes when defining the file paths, as they're slightly faster since they're not interpolated. Also use require/include instead of their _once versions: the autoloader is guaranteed to run only once per class, and the plain versions are a fair bit faster.
The above is great, in fact, even with a large code base with a tonne of classes, the hash isn't that big and performance has never been a concern. And more importantly we're not married to some crazy pseudo name space class naming convention, see below.
The other option is the delimited-name, pseudo-namespace trick. This is less attractive, as namespaces will come with 5.3 and renaming these classes across the code base will be less fun. Regardless, this is how it works: assume a root for all your code. Then all classes are named after the directory traversal required to get there, delimited by a character such as '_', followed by the class name itself; the file, however, is named after the class only. This way the location of the class is encoded in its name, and the autoloader can use that. The problem with this method, besides really_long_crazy_class_names_MyClass, is that there is a fair bit of processing on each call, but that might be premature optimisation, and again, namespaces are coming.
e.g.
/code root
    ClassA                ClassA.php
    /subfolder
        subFolder_ClassB      ClassB.php