Knowledge base web app -- need a demo mode - php

I was contracted to build an on-line knowledge base that searches and cross-references many thousands of replacement parts for a niche industry. My client furnishes this app to his customers on a subscription basis.
It uses MySQL and PHP and it works great. I want to deploy it in "demo mode" to sell my skills. I want the user to see the functions, but I have to guard the data for my client.
My first idea was to obfuscate the results. That's at cross-purposes with showing how well it searches. I'm considering a limit on how many searches you can perform, but that's awkward too as someone could visit every day and get more answers than we would prefer.
Other posts I've found are about letting people interact with an app, but without the challenge of protecting a big knowledge base.
Can you suggest an approach? (Note: I added the obfuscation tag, but I'm not sure it applies, because Java code obfuscation seems to be unrelated.)
UPDATE 1: About obfuscation ... I've kind of wanted (or assumed, or fantasized about) on-the-fly obfuscation. Which is kind of hard in itself, I think. One answer so far implies a one-time scramble, which is probably how I need to approach this if I do it.
UPDATE 2: Thanks for the two warnings on legitimate use. This is all on the up-and-up! I'm as ethical as the day is long, and almost as ignorant.
UPDATE 3: I have two responses, both excellent quality. Chris L. got me to "think outside the box" and provided what seems to be the best solution.
FINAL: ... and there's not that much to show anyway!

Screen shots (or something similar) are your best bet. They are quick, easy to browse, and no one really has to think about what they are doing while looking at them.
Make sure you have approval from your client.

Legally speaking, be very careful: taking a copy of a system of this sort (especially its data) and using it for your own purposes can get you into a lot of hot water depending on the contracts you signed and (under US law) whether or not the system was considered a work-for-hire.
That said, my personal preference for a system like this would be data obfuscation. Change the names and numbers associated with the different parts for which it searches in order to create a system which demonstrates equivalent functionality but for a different, wholly fictional industry. (Turn things into widgets, gadgets, whatzits and so forth.)
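A one-time scramble along these lines could look roughly like the sketch below. It is written in Python for brevity (the same logic ports to PHP with hash_hmac), and the key, field names, and noun list are all made up for illustration. The one property it tries to show is that a deterministic mapping keeps cross-references intact: every occurrence of a real part number turns into the same fictional one. The six-digit space is small enough that collisions are possible on a large catalog, so a real run would need a uniqueness check.

```python
import hashlib
import hmac

# Hypothetical secret used only while scrambling; it should not ship with the demo.
SCRAMBLE_KEY = b"demo-scramble-key"

# Fictional product families, in the spirit of "widgets, gadgets, whatzits".
FAKE_NOUNS = ["widget", "gadget", "whatzit", "doohickey", "gizmo"]


def _digest(value: str) -> bytes:
    """Keyed hash, so the mapping cannot be rebuilt without the key."""
    return hmac.new(SCRAMBLE_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def fake_part_number(real_number: str) -> str:
    """Map a real part number to a stable fictional one."""
    d = _digest("num:" + real_number)
    return "X" + str(int.from_bytes(d[:4], "big") % 1_000_000).zfill(6)


def fake_part_name(real_name: str) -> str:
    """Map a real part name to a stable fictional name."""
    d = _digest("name:" + real_name)
    noun = FAKE_NOUNS[d[0] % len(FAKE_NOUNS)]
    return f"{noun.capitalize()} {int.from_bytes(d[1:3], 'big') % 900 + 100}"


def scramble_rows(rows):
    """rows: dicts with 'part_no', 'name' and 'replaces' (a part number or None)."""
    return [
        {
            "part_no": fake_part_number(r["part_no"]),
            "name": fake_part_name(r["name"]),
            # The same input always maps to the same output, so links survive.
            "replaces": fake_part_number(r["replaces"]) if r["replaces"] else None,
        }
        for r in rows
    ]
```

Note that the output is still derived from the client's data, so this does not remove the need for the client's approval.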
If a potential client shows sufficient interest, see if you can arrange for a limited-time demo account with your original client to demonstrate the system at full functionality.

Assuming you own the application itself, the issue is that data. "I am not a lawyer", but I would not use a client's data no matter how I obfuscated it. Generate a data set from scratch.
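Generating a data set from scratch does not have to be much work. Here is a rough sketch (Python for brevity; the column names, the DEMO- prefix, and the 30% supersession rate are invented for illustration) that produces fictional parts with cross-references between them, so a search demo still has something to cross-reference:

```python
import random


def generate_parts(count, seed=42):
    """Build a fictional parts catalog; nothing here comes from real data."""
    rng = random.Random(seed)  # fixed seed, so the demo data is reproducible
    families = ["Widget", "Gadget", "Whatzit", "Gizmo"]
    parts = []
    for i in range(count):
        # Roughly 30% of parts supersede some earlier part (a cross-reference).
        replaces = None
        if parts and rng.random() < 0.3:
            replaces = rng.choice(parts)["part_no"]
        parts.append(
            {
                "part_no": f"DEMO-{i:05d}",
                "name": f"{rng.choice(families)} model {rng.randint(100, 999)}",
                "replaces": replaces,
            }
        )
    return parts
```

The rows can then be bulk-inserted into the demo database in place of the real tables.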
Many years ago I was with a company and we took a client's data set, cleaned it up, changed the names to protect the innocent, and so on, and used it for screen shots. You would have thought its own mother couldn't have recognized it. Wrong. Some time later, the client whose data it was said to us, "That's our data." Nobody got sued, and there were not even any hard feelings, but the fact is, no matter what you do to it, it is still not YOUR data.
"I'm as ethical as the day is long, and almost as ignorant." Your good intentions may count for nothing if you screw up.
Good luck.

If I were in your shoes, I would simply create some sample data to populate the database and every other eventual content used.
Then, I would choose one or more of the following options to present the product to the client:
Screen Shots
Screen Casts
Real Demo
Screen casts are usually more effective than screen shots (the wow effect on the client), but they are a bit trickier to create. Still, software such as ScreenFlow (Mac) makes their creation easy and quick.
I would personally avoid data obfuscation. In the past, it sometimes turned out to be very difficult to explain to the customer that the data were obfuscated for demo purposes only (even when this was explicitly stated). The client's reaction was still one of great perplexity.

Related

What might be the best way to benchmark a user's PC, PHP or JS?

PHP - Apache with Codeigniter
JS - typical with jQuery and in house lib
The Problem: Determining (without forcing a download) a user's PC ability and/or virus issues
The Why: We put out software that is mostly used in clinics, but can be used from home. However, we need to know, before they go to our main site, whether their PC can handle the heavy demands of our web-based, browser-served software.
Progress: So far, we've come up with a decent way to test dl speed, but that's about it.
What we've done: In PHP we create about a 2.5Gb array of data to send to the user in a view. From there the view calculates the time it took to get the data and then subtracts the PHP benchmark from this time in order to get a point of reference for upload/download time. This is not enough.
Some of our (local) users have been found to have "crappy" PCs or to be virus infected, and this can lead to 2 problems: (1) they crash in the middle of performing a task in our program, or (2) their viruses could be trying to inject into our JS, creating a bad experience that may make us look bad to the average user (who is not educated on how this stuff works), thus hurting "our" integrity.
I've done some googling around, but most plug-ins or advice forums/blogs I've found simply give ways to benchmark the speed of your JS, and that is simply not enough. I need a simple bit of code with no visual interface (a problem I found with one nice JS lib that did this, but it would take days to remove all of the author's personal visual code) that will allow me to test the following 3 things:
The user's data transfer rate (I think we have this covered, but if a better method is presented I won't rule it out)
The user's processing speed: how fast the computer is in general
A possible test for infection via malware, adware, or whatever may be harmful to the user's experience
What we are not looking to do: repair their pc! We don't care if they have problems, we just don't want to lead them into our site if they have too many problems. If they can't do it from home, then they will be recommended to go to their nearest local office to use this software "in house" so to speak.
Further Explanation
We know you can't test the user-side stuff with PHP; we're not that stupid. PHP is mentioned because it can still be useful in either determining connection speed or in delivering a script that may do what we want. Also, this is not software for just anyone on the net to sign up for and use. If you find it online, unless you are affiliated with a specific clinic and have a login name and whatnot, you're not meant to use the site, and if you get in otherwise, it's illegal. I can't really reveal a whole lot of information yet, as the site is not live yet. What I can say is that it is mostly used by clinics/offices for customers to perform a certain task. If they don't have time or transport, or otherwise need to do it from home, then the option is available. However, if their home PC is not "up to snuff", it will be nothing but a problem for them and will make the 2-hour task they are meant to perform become a 4-6 hour nightmare. That is the reason I'm at one of my favorite question sites, asking if anyone has had experience with this before and knows a good way to test the user's PC, so they can have the best possible resolution: either do it from home (as their PC is suitable) or be told they need to go to their local office. Hopefully this clears things up enough that we can refrain from the "sillier" answers. I need a REAL viable solution and/or suggestions, please.
PHP has (virtually) no access to information about the client's computer. Data transfer can just as easily be limited by network speed as computer speed. Though if you don't care which is the limiter, it might work.
JavaScript can reliably check how quickly a set of operations runs, and send the timings back to the server... but that's about it. It has no access to the file system, for security reasons.
EDIT: Okay, with that revision, I think I can offer a real suggestion - basically, compromise. You are not going to be able to gather enough information to absolutely guarantee one way or another that the user's computer and connection are adequate, but you can get a general idea.
As someone suggested, use a 10MB-20MB file and several smaller ones to test actual transfer rate; this will give you a reasonable estimate. Then, use JavaScript to test their system speed. But don't just stick with one test, because that can be heavily dependent on browser. Do the research on what tests will best give an accurate representation of capability across browsers; things like looping over arrays, manipulating (invisible) elements, and complex math. If there is a significant discrepancy between browsers, then use different thresholds; PHP does know what browser they're using, so you can give the system different "good enough" ratings depending on that. Limiting by version (like, completely rejecting IE6) may help in that.
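On the server side, turning those measurements into a verdict might look something like this rough sketch (Python for brevity; the same arithmetic is a few lines of PHP). Every threshold and browser key below is a made-up placeholder that you would have to calibrate yourself; the JavaScript timing test is assumed to report a single duration in milliseconds.

```python
# Hypothetical limits; calibrate them against machines you consider acceptable.
MIN_MBPS = 2.0
CPU_LIMIT_MS = {"default": 400, "ie": 900}  # slower JS engines get a looser limit


def transfer_rate_mbps(payload_bytes, elapsed_seconds):
    """Megabits per second for one timed download."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return payload_bytes * 8 / elapsed_seconds / 1_000_000


def rate_client(browser, cpu_test_ms, samples):
    """Return the list of failing areas; an empty list means 'good enough'.

    samples: (payload_bytes, elapsed_seconds) pairs from several downloads.
    """
    rates = sorted(transfer_rate_mbps(b, s) for b, s in samples)
    typical = rates[len(rates) // 2]  # middle sample (upper median for even counts)
    problems = []
    if typical < MIN_MBPS:
        problems.append("network")
    if cpu_test_ms > CPU_LIMIT_MS.get(browser, CPU_LIMIT_MS["default"]):
        problems.append("computer")
    return problems
```

Taking the middle of several samples, rather than a single download, keeps one slow transfer from failing an otherwise adequate connection.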
Finally... inform the user. Gently. First let them know, "Hey, this is going to run a test to see if your network connection and computer are fast enough to use our system." And if it fails, tell them which part, and give them a warning. "Hey, this really isn't as fast as we recommend. You really ought to go down to the local clinic to perform this task; if you choose to proceed, it may take a lot longer than intended." Hopefully, at that point, the user will realize that any issues are on them, not on you.
What you've heard is correct: there's no way to effectively benchmark a machine based on JavaScript, especially because the JavaScript engine mostly depends on the actual browser the user is using, amongst numerous other variables (no file system permissions, etc.). A computer is hardly going to let a browser's sub-process stress it anyway; the browser would simply crash first. PHP is obviously out as it's server-side.
Sites like System Requirements Lab have the user download a Java applet to run in its own scope.

Challenge: maximize cost of obfuscation's reverse engineering

Disclaimer: Similar questions have been asked a number of times on SO; however, this question is much more specific and has not been adequately addressed so far.
We're developing a new packaged software, which, for business security reasons, must run on our customer's server, in PHP. The software is sold with a per-user end-license; the price range is $20-80 per user, and the target market is small (and web-savvy) consultancies and IT agencies.
To discourage piracy (e.g. removing the user-license enforcement), we'd like to maximize the protection of the PHP code by any means technologically available that does not inconvenience the user.
Let's break this down:
does not inconvenience the user: no additional server-side installs (no zend decoder, or other binaries). Has to run on a plain-vanilla shared PHP host out-of-the-box.
Maximize the protection: the cost of breaking the protection has to outweigh the cost of buying an additional license. That is, it has to take at least 3-5 working days for a professional hacker to remove the user license protection.
Any means technologically available: might call home, might use high-end crypto, might implement a c64 emulator.
To pro-actively address the so far highest-voted non-solutions:
NOT looking for perfect obfuscation, just extremely hard ones (defined as: have to take at least 3-5 working days to decrypt), OR other anti-piracy methods
NOT looking for "black-box" software packages whose workings I don't know and whose fitness for our purpose I can't determine; looking for algorithmic and out-of-the-box ideas.
NOT looking for license/law-side protection, we already have that covered.
We DO know, that given enough time, and focus, all obfuscation will be hacked sooner or later; we merely want this not to be the economical solution.
Given the above constraints, what methods, or ideas would you use to maximize anti-piracy measures?
Bounty-hunt: point goes for the hardest algorithmic method to reverse-engineer the code, given the constraints above.
Update / Bounty-hunt: I've accepted Ira Baxter's answer, mostly because the rest failed to answer the core question, and attempted to question the underlying assumptions (business, closed source, yadda yadda). Thanks all!
I think what you want to do is to transform the code algorithmically, to obfuscate not only what is executed, but also to obfuscate the data structures. We assume we start with a clean version of the program, produced by the developer. He always works with the clean version. Obfuscation produces the to-ship version. Good obfuscation will produce a to-ship version with exactly the same functionality as the original, so no further testing is (arguably) needed.
For control flow scrambling, the idea is to take the nicely written code you have at the start, and push it through transformations that make static (and human) analysis of the decisions that control the flow difficult by multiplying the set of assumptions that have to be analyzed. For instance, if you have two pointers, and store a value through one, can it affect the value seen by the other? Depending on whether the pointers are aliased or not, you can get two different answers. Now take N pointers, each of which may be aliased; you get 2^N possible aliasing relations. If the reader doesn't know the exact combination, he won't be able to determine if a decision might be true, false or conditional. Of course, the tool that generates this produces conditionals whose outcome it knows, because it designs (generates) the pointer rat's nest to produce a specific outcome.
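To make "conditionals whose outcome the tool knows" concrete, here is a toy illustration in Python. It deliberately swaps the aliasing construction described above for a much simpler arithmetic opaque predicate (a product of two consecutive integers is always even), so treat it as a sketch of the general idea, not of the pointer-based technique. The function names are invented for the example.

```python
def opaque_true(x: int) -> bool:
    """Always True: x * (x + 1) is a product of consecutive integers, so it is even."""
    return (x * (x + 1)) % 2 == 0


def license_ok_clear(seats_used, seats_licensed):
    """The clean version the developer works on."""
    return seats_used <= seats_licensed


def license_ok_obfuscated(seats_used, seats_licensed, noise):
    """Same behavior, but the real check sits behind predicates a reader must solve."""
    if opaque_true(noise):
        result = seats_used <= seats_licensed
    else:
        result = True  # decoy branch, never executed
    if not opaque_true(noise * 7 + 3):
        result = not result  # decoy inversion, never executed
    return result
```

A single predicate like this is trivial to spot by hand; the point of the aliasing approach is that a generator can produce many predicates whose truth depends on facts that are expensive to recover.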
See Code Obfuscation Literature Survey (not my paper), which discusses a variety of control flow and data flow obfuscation. This is likely not the most recent summary of what is possible, but it's pretty instructive. You should note doing this kind of obfuscation has some impact on execution time.
What the papers on this topic make clear is that control and data flow obfuscated programs are extremely hard for static analyzers to "understand"; the papers provide/reference demonstrations of the algorithmic complexity of processing such obfuscated programs.
Now, you might argue that people aren't static analyzers and therefore don't suffer the same limitations. You might be right; Roger Penrose famously argues that people do not have the same constraints as Turing machines; the argument isn't settled by a long shot. But the entire foundation of encryption/hashing technology is built on essentially the same kind of computational complexity arguments. And to date, nobody has proven smart enough to crack these technologies in ways that can be used in daily life by thieves (good thing, or your bank accounts would be empty).
To do this to a PHP program, you need tools that can parse the PHP code, and carry out such transformations. Our DMS Software Reengineering Toolkit has robust PHP parsers, and can apply very complex transformations to code. To do this really well, you want to apply the transformations globally across all your code, not just on a file-by-file basis. We don't have this kind of obfuscation transformation implemented on PHP, but if you really wanted to do it, this would be the way. We have applied complex transformations to PHP programs for other commercial products that we sell.
When you are all done, ideally you'd compile this result to machine code, say using the HipHop compiler. (Just compiling would defeat some folks, but not the serious software engineers).
EDIT: Obfuscation != AntiPiracy is a theme in other answers. So how does obfuscation help?
First you need to deal with the anti-piracy issue. The obvious things to do are:
Add copyright comments to each file. These serve as warnings to thieves. Not good ones.
Add copyright strings in various places and print them out occasionally; these will end up in memory and play a role if a pirate steals the code; he stole this string, too.
Add a string to your application saying, "licensed to ". This makes your customer unenthusiastic about letting it be stolen.
Add a check to your application that it is running on the intended customer's machine. (Since your app is intended to be very cheap, you'll probably need to automate a registration process.)
Have the application phone home with its machine ID occasionally.
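As a rough sketch of the "licensed to" string plus the machine check from the list above (Python for brevity; in PHP the same thing can be done with hash_hmac and hash_equals), a license could be a small signed record that names the customer and the machine. The secret, field layout, and function names are invented for the example. Because an HMAC needs the secret inside the shipped code, this only works as well as the obfuscation hiding it; a public-key signature would avoid shipping the signing secret at all.

```python
import hashlib
import hmac

# Hypothetical vendor secret; with HMAC it has to be embedded in the shipped code.
VENDOR_SECRET = b"hypothetical-vendor-secret"


def _sign(payload: str) -> str:
    return hmac.new(VENDOR_SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_license(customer, machine_id, seats):
    """Vendor side: bind customer name, machine ID and seat count together."""
    payload = f"{customer}|{machine_id}|{seats}"
    return f"{payload}|{_sign(payload)}"


def check_license(license_text, current_machine_id):
    """Customer side: return license details, or None if invalid or on the wrong machine."""
    try:
        customer, machine_id, seats, signature = license_text.rsplit("|", 3)
    except ValueError:
        return None  # not even the right shape
    expected = _sign(f"{customer}|{machine_id}|{seats}")
    if not hmac.compare_digest(expected, signature):
        return None  # edited or forged
    if machine_id != current_machine_id:
        return None  # copied to another server
    return {"customer": customer, "seats": int(seats)}
```

The returned customer name is what the application would print in its "licensed to" banner.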
Now, these steps prevent someone (legally and technically) from stealing your code.
If this is all you have, an unfazed pirate will simply remove the technical checks and it's stolen. It is very hard to prevent somebody from copying the bit stream that makes up your product; computers are far too good at copying. So your goal is to arrange for it to be hard for him to derive value if he does, and that's where obfuscation comes in.
If the code is sufficiently obfuscated, he will have a difficult time locating the license check and phone-home mechanisms to disable them. (I suggest several checks, none of them always called, to make it hard for the thief to tell when he is successful.) The obfuscation, well done, should protect the printing of the original owner's name, which means the original owner will have some interest in preventing it from being stolen, as you'll name him along with the pirate in any lawsuit.
If they defeat the licenses, copyright printing, and phone-home mechanisms, and simply want to run it in the back room without telling you, you might be stuck. (For $80.00, I can't imagine why they'd go to all this trouble just for this effect.)
But many thieves want to modify the software to "improve" it, especially if they want your market. Serious obfuscation will prevent them from doing this; it will even make it hard for them to add their own license controls. That limits the value pretty severely.
They may simply steal it and release it to the world for free; your hope here is that the application is hard to crack. If they succeed, your only good defense is a continuing stream of upgrades that licensed owners get.
Obfuscation is a key to successful piracy defense, IMHO.
Obfuscation != Anti-piracy. For instance, you could have a heavily obfuscated class, but I can use reflection to see all methods that this class implements. I can then extend this class and override any methods that I don't like. Are you storing a secret? Because any secret value can be pulled from memory using a debugger.
3-5 days? Even with Zend-Guard it takes 3-5 seconds to break using some open source tool. Most obfuscation tools are very primitive and easy to break.
I'm sorry but I don't think there is a good solution for this.
The best anti piracy method is no method.
If you don't want to use tools such as Zend, then you are better off doing absolutely nothing.
Take it from me: you can waste more time and lose sales trying to stop pirates. You will only hurt yourself. They don't care, and it's good fun; the harder you make it, the more satisfaction they get in doing it. And once it's done, it will be available for all via a torrent, so no one needs to repeat the effort.
Make a good application. Make it work well. Give fantastic service, and the customers you want will gladly pay. Those customers you don't want will NEVER pay, so don't waste time on them. And guess what, they actually become good advertising: people see your software on more sites and come looking for it.
So in effect you are getting free advertising.
So don't stress, don't waste your time, and don't blame pirates if your software fails. Blame yourself, because you got too distracted trying to do the impossible.
I wanted to add a little bit of my personal experience.
Back in the 90's I spent many months creating encryption techniques to reduce/prevent pirating of a heavily pirated piece of software. In the end I 'mostly' succeeded.
I used custom encryption, junk insertion, random number generators, cross module CRC checking, blah blah blah.
I used to hang out in the newsgroup devoted to hacking my software and others like it, and even struck up conversations. One polite fellow said, "Why are you wasting your time? We do this for fun." But I was hooked. It was a competition.
If I had spent the time and effort on improving the software instead, I would have earned 10x the amount I thought I had lost to piracy.
It was a fool's victory.
I thought about this a lot, and what you are asking is essentially impossible. You can obfuscate to no end and people will still steal your software. There is little you can do about it. If you write in code to call home, someone will strip it out and just put true in instead. Your best bet is to write quality software so people want to buy it. It's either that or use a commercial solution like ionCube or Zend.
Only a few things can really work. The most basic logic I can think of that would be effective (since this market sounds like it's fairly controlled, and finite) would be to use something similar to a licensing server, but with a two-way communication channel (that you can encrypt etc.. etc..).
Now, of course someone can disable that communication channel, but between the code you will add to disable the software when that happens, and the fact that your company will be able to follow up with the client (since you will know exactly who is "down"), that will help.
The third part of the logic is for each license that is given out to play a role in generating the "checks" that will occur between the software and your licensing server. This means you generate, on-premise, unique hash codes that are used as part of the answer your software sends back to the server. That pretty much rules out the hacking, because the hacker would have to know what algorithm you are using to generate the licensing (since it is pre-generated, there is no logic to use to decipher it) and the hacker would have to feed you a licensing key.
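One way to read the per-license "checks" described above is as a challenge-response exchange, where each license carries its own pre-generated secret and the server sends a fresh random challenge every time. The sketch below is only that reading, in Python for brevity; the class and function names are invented, and a real server would persist its tables in a database.

```python
import hashlib
import hmac
import secrets


class LicenseServer:
    """Vendor side: knows one pre-generated secret per license."""

    def __init__(self):
        self._secrets = {}  # license_id -> secret bytes
        self._pending = {}  # license_id -> outstanding challenge

    def issue(self, license_id):
        """Generate the secret that gets baked into this customer's copy."""
        secret = secrets.token_bytes(32)
        self._secrets[license_id] = secret
        return secret

    def challenge(self, license_id):
        """Start a check: a fresh random value that is valid for one answer only."""
        nonce = secrets.token_hex(16)
        self._pending[license_id] = nonce
        return nonce

    def verify(self, license_id, response):
        """Accept only the right answer to the outstanding challenge."""
        nonce = self._pending.pop(license_id, None)  # single use, blocks replays
        secret = self._secrets.get(license_id)
        if nonce is None or secret is None:
            return False
        expected = hmac.new(secret, nonce.encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, response)


def client_response(secret, nonce):
    """Customer side: answer the challenge using the secret shipped with the license."""
    return hmac.new(secret, nonce.encode(), hashlib.sha256).hexdigest()
```

Since the challenge changes every time, recording one successful exchange and replaying it later does not satisfy the server.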
The fourth step, optionally, would be to push updates to clients to refresh the security mechanisms you have in place and run "tamper" checks on your code, possibly periodically feed some sort of hash to be used in the logic your software uses to connect to the licensing server.
This still isn't perfect: someone "will" be able to clone a production machine, circumvent/redirect the licensing (and you won't know, since it will be a copy) and try to work away at the checks in your code which require a license (as someone above mentioned, set all the logic to "True")... but you could definitely spend the time putting checks and encryption on your licensing system and make it a time-consuming and "risky" process. Unless... as a final touch... you can have some deliverable from your product generated by your server (none of that code is in what the client has) and pushed to the software that has this licensing mechanism in place. But I don't know how possible that is.
Artificial code bloat
By using post processors to automatically bloat the code and insert logic multipliers you make the code hard to modify
I use tags in the original source to indicate the type of code in each method and which code multiplier to use. Randomisers can help too, as each release looks very different
The code bloat is achieved by a variety of processes. e.g. repeating and random fiddling of variables before and after they are officially in scope. Lots of extra logic steps that will never get followed. Breaking single statements into many random small steps. Interlace these with as many other statements as possible as long as the final step is in the correct order. etc etc
The final and most important part of this process is to interlace key generation and call home processes through this mess, and to be part of this mess (remember the "random fiddling of variables before and after they are officially in scope") so that the time taken to remove the key generation and call home become unwieldy
The call home server has to act like a rolling code remote control so while the attacker might discover the call home functions, taking them out will result in incorrect initialisation values for general variables in general methods, and in as many cases as you can work with
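A very reduced sketch of the rolling-code idea (Python for brevity; all names are invented, and this is one possible reading rather than the scheme actually used): the server never sends an initialisation value in the clear. It sends the value masked with a code derived from a shared secret and a counter that advances on every call, so a stubbed-out or replayed call home leaves the program with a wrong value instead of an obvious failure.

```python
import hashlib
import hmac


def rolling_code(secret: bytes, counter: int) -> int:
    """32-bit code that changes unpredictably each time the counter advances."""
    mac = hmac.new(secret, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(mac[:4], "big")


def server_reply(secret: bytes, counter: int, init_value: int) -> int:
    """Call-home server: mask the real initialisation value for this counter."""
    return init_value ^ rolling_code(secret, counter)


def client_init_value(secret: bytes, counter: int, reply: int) -> int:
    """Shipped code: unmask the reply; a stale or fake reply yields garbage."""
    return reply ^ rolling_code(secret, counter)
```

In the scheme described above, the recovered value would feed ordinary variables in ordinary methods, which is what makes removing the call home costly.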
Over time you can build the general purpose code re-parser, and a library of functions to mess the code up. Keep adding the code mess library to improve the obfuscation level
You need to have a well covering unit and integration test library to validate the code after being messed up
I have not done this with PHP, but with other languages with similar constraints as PHP
Note: This technique works fine for complex scientific software where there is large amounts of cryptic logic and maths anyway. It may not work so well for typical web sites like CMS's unless your code multipliers are very convincing
If I get this right, why not invest in a server to be delivered within the cost of the application: a server which can be placed at the customer's site, with only one port opened for HTTP access? With $1000 you can get a machine that can work as a safe for your software. If anyone attempts to hack into it, you will know.
Another solution might be:
Currently I am working for a huge company that has approx. 350 selling points (shops) all over the country. As we cannot rely on the internet connection 100%, we have a server at each shop. This server handles the business required for actual selling and it is linked to a local database. The rest of the stuff sits at the head-office server. Now, the clerks have computers in front of them, and all these computers work with the application hosted on the local server. The catch is that the local server has a registry which knows whether a certain service is placed locally (on the same machine) or remotely (at the head office) and executes the call as required (over HTTP to the remote location, or as a direct call to the local service). Services can be placed anywhere (local or remote), and all one needs to do is configure their location in the registry by entering one of the keywords local, remote, or application (the application keyword means that the service is first called remotely and, if that fails, it is called locally). This way you can make an acceptable compromise. Highly necessary stuff can sit locally and the rest of the business logic can reside on your server where nobody can touch it.
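The registry is simple enough to sketch. The version below is in Python for brevity and is my own reconstruction from the description above, not our production code; the remote callables stand in for HTTP calls to the head office, and ConnectionError stands in for whatever your HTTP client raises when the link is down.

```python
class ServiceRegistry:
    """Routes each service call according to its configured location keyword."""

    MODES = ("local", "remote", "application")

    def __init__(self):
        self._services = {}  # name -> (mode, local callable, remote callable)

    def register(self, name, mode, local=None, remote=None):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self._services[name] = (mode, local, remote)

    def call(self, name, *args):
        mode, local, remote = self._services[name]
        if mode == "local":
            return local(*args)  # direct call on the shop server
        if mode == "remote":
            return remote(*args)  # call to the head office
        # "application": try the head office first, fall back to the local copy.
        try:
            return remote(*args)
        except ConnectionError:
            return local(*args)
```

For licensing purposes, the services you want to protect would be registered as remote only, so their code never reaches the customer's machine.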
The short answer is no, there is no way to obfuscate code in such a complex manner that it takes days to crack. The simple explanation: obfuscation is a two way process. It can be done and undone. If a computer can do it, a determined person can do it too.
Instead of wasting so much time on protecting your code, why not take the hint from the popular TV show 24 (side note: Should have never been canceled!). To ensure scripts weren't stolen or revealed to the public, they watermarked each with a number specific to each cast member, director, producer, etc. You can do something similar with your scripts by "watermarking" each PHP file. This can be something as simple as changing the name of a variable to reflect a client ID or something as complex as spreading identifying characters over multiple variable and function values/names. Try working this identifier and/or parts of it into as many inconspicuous places in your scripts as possible. Only you can know the exact combination that creates the identifying information. This way if code is leaked you can sue the responsible party.
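A build step for this kind of watermarking could be sketched as follows (Python for brevity; the __WM0__ placeholder convention, the key, and the tmp_ prefix are all invented for illustration). Each client gets identifier names derived from a keyed hash of the client ID, so the names look like ordinary throwaway variables but can be traced back by whoever holds the key.

```python
import hashlib
import hmac

# Hypothetical key; only the vendor holds it, so only the vendor can trace a leak.
WATERMARK_KEY = b"hypothetical-watermark-key"


def watermark_names(client_id, count=3):
    """Derive innocuous-looking identifier names that are specific to one client."""
    names = []
    for i in range(count):
        message = f"{client_id}:{i}".encode("utf-8")
        digest = hmac.new(WATERMARK_KEY, message, hashlib.sha256).hexdigest()
        names.append("tmp_" + digest[:6])
    return names


def stamp(template, client_id, count=3):
    """Replace the placeholders __WM0__, __WM1__, ... in a PHP source template."""
    stamped = template
    for i, name in enumerate(watermark_names(client_id, count)):
        stamped = stamped.replace(f"__WM{i}__", name)
    return stamped


def identify(leaked_source, client_ids, count=3):
    """Return the client whose full set of names appears in the leaked code."""
    for client_id in client_ids:
        if all(name in leaked_source for name in watermark_names(client_id, count)):
            return client_id
    return None
```

A simple renaming pass by the leaker would strip these marks, which is why the answer above suggests spreading the identifier across many inconspicuous places and not relying on names alone.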
Just a suggestion: you might want to add lines of code that don't really do anything but look as if they do.

Multi-tier applications with PHP?

I am relatively new to PHP, but I am an experienced Java programmer in complex enterprise environments with SOA architecture and multitier applications. There, we'd normally implement business applications with business logic on the middle tier.
I am programming an alternative currency system, which should be easily deployable and customizable by individuals and communities; it will be open source. That's why php/mysql seems the best choice for me.
Users have accounts, and they get a balance. Also, the system calculates prices depending on total services delivered and total available assets.
This means, on a purchase a series of calculations happen; the balance and the totals get updated; these are derived figures, something normally not put into a database.
Nevertheless, I resorted to putting triggers and stored procedures into the db, so that in the php code none of these updates are made.
What do people think? Is that a good approach? My experience suggests to me that this is not the best solution, and prompts me to implement a middle tier. However, I would not even know how to do that. On the other hand, what I have so far with stored procs seems to me the most appropriate.
I hope I made my question clear. All comments appreciated. There might not be a "perfect" solution.
As is the tendency these days, getting away from the DB is generally a good thing. You get easier version control and you get to work in just one language. More than that, I feel that stored procedures are a hard way to go. On the other hand, if you like that stuff and you feel comfortable with SPs in MySQL, they're not bad, but my feeling has always been that they're harder to debug and harder to handle.
On the triggers issue, I'm not sure whether that's necessary for your app. Since the events that trigger the calculations are invoked by the user, those things can happen in PHP, even if the user is redirected to a "waiting" page or another page in the meantime. Obviously, true triggers can only be done on the DB level, but you could use a daemon thread that runs a PHP script every X seconds... Avoid this at all costs and try to get the event to trigger from the user side.
All of this said, I wanted to plug my favorite solution for the data access layer on PHP: Doctrine. It's not perfect, but PHP being what it is, it's good enough. Does most of what you want, and keeps you working with objects instead of database procedures and so forth.
Regarding your title, multiple tiers are, in PHP, totally doable, but you have to do them and respect them. PHP code can call other PHP code, and it is now (5.2+) nicely OO and all that. Do make sure to ignore the fact that a lot of PHP code you'll see around is total crap and does not even use methods, let alone tiers, and decent OO modelling. It's all possible if you want to do it, including doing your own (or using an existing) MVC solution.
One issue with pushing lots of features to the DB level, instead of a data abstraction layer, is that you get locked into the DBMS's feature set. Open source software is often written so that it can be used with different DBs (certainly not always). It's possible that down the road you will want to make it easy to port to postgres or some other DBMS. Using lots of MySQL specific features now will make that harder.
There is absolutely nothing wrong with using triggers and stored procedures and other features that are provided by your DB server. It works and works well, you are using the full potential of the DB, instead of simply relegating it to being a simplistic data store.
However, I'm sure that for every developer on here who agrees with you (and me), there are at least as many who think the exact opposite and have had good experiences with doing that.
Thanks guys.
I was using DB triggers because I thought it might be easier to control transaction integrity like that. As you might realize, I am a developer who is also trying to get a grip on DB knowledge.
Now, I see there is the solution to spread the php code on multiple tiers, not only logically but also physically by deploying on different servers.
However, at this stage of development, I think I'll stick to my triggers/sp solution, as that doesn't feel to be that wrong. Distributing on multiple layers would require me to redesign my app consistently.
Also, thinking open source, if someone likes the alternative money system, it might be easier for people to just change the layout for their requirements, while I would not need to worry about calculations going wrong if people touch the php code.
On the other hand, of course, I agree that db stuff might get very hard to debug.
The DB init scripts are in source control, as are the php files :)
Thanks again

Thumbs system on Urban Dictionary

I was thinking of implementing a thumbs system, but mine would require a registration thus ruling out the possibility of people voting more than once unless they create a new account to do so. So I was wondering about Urban Dictionary's thumb system. How does it work? I would imagine that my IP would be stored in a database, so people would not be able to vote more than once however IPs do change pretty often and especially when you're on an iPhone. Probably a combination of cookies and IP checking. Can anyone give me a better insight? What would they check for to ensure you don't vote more than once?
The reason I ask is that I may want to make mine a public system instead. Maybe even a hybrid, similar to SO where you can ask a question before creating an account and then have the two linked together. I am using PHP and MySQL.
Almost always it's done with cookies. As you say, IPs can't be used (naively) as they change, or cover too many people (i.e. everyone in a given office, etc).
But online polls are not reliable anyway, so don't get too concerned about solving a problem no-one cares about. You can implement more 'intelligent' rules but then you need to ask what benefit you are getting for all your work.
Personally, I would go with:
Cookies
Forced signup voting
Some sort of analysis of voting patterns
Because it goes without saying that people can just sign up constantly, to submit more votes. It really depends on what benefit people get from voting, and how much you care (in terms of time, which is, obviously, money).
I know Urban Dictionary allows for more than one vote per day. Once every six hours to be exact.
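A minimal sketch of the cookie-or-account approach with a cooldown might look like the following. All function names are hypothetical, the six-hour window is just the figure quoted above (not a verified Urban Dictionary rule), and the `$votes` array stands in for what would be a MySQL table.

```php
<?php
// Hypothetical names throughout; an in-memory array replaces the database.

const VOTE_COOLDOWN_SECONDS = 6 * 3600;

/**
 * Identifies the voter: the account id when logged in, otherwise a cookie token.
 * A fresh random token is generated when the visitor has no cookie yet.
 */
function voterKey(?int $userId, ?string $cookieToken): string
{
    if ($userId !== null) {
        return 'user:' . $userId;
    }
    return 'cookie:' . ($cookieToken ?? bin2hex(random_bytes(16)));
}

/**
 * True if the voter may vote again. $lastVoteAt is the Unix timestamp
 * of the previous vote on this item, or null if there was none.
 */
function mayVote(?int $lastVoteAt, int $now, int $cooldown = VOTE_COOLDOWN_SECONDS): bool
{
    return $lastVoteAt === null || ($now - $lastVoteAt) >= $cooldown;
}

/**
 * Records the vote time under "voter|item" if the cooldown has passed.
 * Returns false when the vote is rejected.
 */
function castVote(array &$votes, string $voter, int $itemId, int $now): bool
{
    $key = $voter . '|' . $itemId;
    if (!mayVote($votes[$key] ?? null, $now)) {
        return false;
    }
    $votes[$key] = $now;
    return true;
}
```

Clearing cookies defeats the anonymous path, as noted above, so this only raises the effort needed to vote repeatedly. Keying on the account id when one exists is what lets a hybrid system link anonymous and registered activity later.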

How to prevent resale of PHP source? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Do you have a strategy for this? If I sell a web-system to a client and in accordance with the legal agreement, the customer is not allowed to sell it to others, how can I be sure he doesn't do that anyway?
My first idea is some sort of key that must be in the root directory, and that file is only valid for that specific domain.
Other Ideas?
UPDATE 1
I agree that this is mainly a legal problem. But the facts are: I've got a client that buys this system from me to sell it to others. And he wants this system to function so it's easy for him to make his profit. The ability to package the web server and sell it is part of the specification.
UPDATE 2
Another point of view is this. In that case it is hard to prove how much of the resold software comes from my original system.
UPDATE 3
Obfuscating is not an option for me; I really hate it.
Some use an obfuscator like Zend Guard but honestly I think that technical solutions for this kind of problem are as doomed as DRM is for audio and video content. Fundamentally what you're giving them is meant to work so it's just a technical problem to make it work in ways you don't want.
Your recourses here are (imho) legal not technical. You have a contract with the client that lays out what they can and can't do. You have a good lawyer draft that contract. If they don't abide by it then you pretty much have to take them to court.
Don't count on any form of obfuscation or copy protection as any kind of guarantee.
This is particularly a problem for scripting languages because (Zend notwithstanding), they are fundamentally plaintext distribution methods. Java and .Net and other bytecode compiled languages have a little more protection but they can be disassembled into intermediate code too (but obfuscation is more useful here). Truly compiled languages (e.g. C, C++) have the most protection of all since disassembling a 50 meg binary into assembler code typically isn't that useful.
Even then there are no guarantees. If you're not comfortable with that then you need to carefully select your clients, live with the potential breach of contract (and the possible enforcement that might compel you to pursue) or find another line of work.
I reckon the only way to be sure is to offer your product as a hosted solution so the client never has access to the code. If you build it with this goal in mind you can still have resellers and let them skin the system so it looks like their own product.
This works well where I work. In theory customers can licence the code to run on their own infrastructure, but it is priced at such a level that only big companies are prepared to pay, and big companies are on the whole more concerned with legal niceties so are less likely to just run off with your work.
People are very happy to go with hosted solutions if the price is right, and it can have benefits for everyone. The customer doesn't have to worry about getting everything set up and they also have the security of knowing that if something does need tweaking we (the developers) are there to do it.
This is a social problem, not a technical one. You have copyright law on your side; no more should be needed. (Any and all technical solutions would be the equivalent of DRM, which is inherently ineffective.)
Regarding your update: So basically you become a DRM supplier for this client of yours.
So: Does the client understand that DRM is ineffective? Try educating them before wasting time on implementation.
If the client remains adamant, I'd take a long hard look at what current DRM vendors are doing. E.g. lots of handwaving, some obfuscation, and, erm... I don't know... what else do they do? Either way, you can be certain that any solution you implement will be undone in less than 10% of the time it took you to implement it - so spend as short an amount of time on this as you can get away with. (Before it was edited out, you wrote "It's in the spec" about "being sure that the system isn't sold on": this might mean you've agreed to build something which is technically impossible (you can never be sure), and would require you spending an infinite amount of time building something which comes close...)
You might investigate having the application contact some central registry when run for the first time (with embedded fingerprint, different for each sale, so you know who passed on their code). That way your client can find out where the application is being run, and has a chance of contacting those who use it without permission. (Potentially turning them into new paying customers.) Maybe give said central repository the ability to send a kill-signal back? That gets really scary though, and liability concerns would be huge; avoid if at all possible.
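A first-run registration along those lines could be sketched roughly as below. The fingerprint value, the JSON field names, and the marker-file mechanism are all assumptions made for illustration. The actual HTTP call to the registry is left out on purpose, since the endpoint would be your own.

```php
<?php
// Hypothetical sketch: fingerprint, field names, and marker file are assumptions.

// Would be set to a different value in each copy sold, so a leaked copy can be traced.
const INSTALL_FINGERPRINT = 'sale-0001-example';

/** Builds the JSON body an installation would send to the registry on first run. */
function buildRegistrationPayload(string $fingerprint, string $host, int $now): string
{
    return json_encode([
        'fingerprint' => $fingerprint,
        'host'        => strtolower($host),
        'first_run'   => gmdate('Y-m-d\TH:i:s\Z', $now), // UTC timestamp
    ]);
}

/**
 * Returns true only the first time it is called for a given marker file,
 * and creates the marker so later calls return false.
 */
function isFirstRun(string $markerFile): bool
{
    if (file_exists($markerFile)) {
        return false;
    }
    file_put_contents($markerFile, (string) time());
    return true;
}
```

On startup the app would call `isFirstRun()` with a path inside its own data directory and, if it returns true, POST the payload to the registry. Anyone with the source can delete that call, so this detects careless copying rather than preventing determined copying.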
Obfuscating the source is more trouble than it is worth, in my experience, unless you are trying to keep some complicated algorithm secret.
I would suggest doing the following:
1. Make sure you and your client and your lawyers all understand and agree with your contract.
2. Insert a short copyright statement as a comment in every source file.
3. Insert copyright notices into the generated web pages (via page templates or php code) as HTML comments, so a 'view source' will prove that your code is being used in an unlicensed application.
4. If you're really worried, and this isn't an intranet-only app, you might expand on (3) and insert unique hidden text into the pages that is seen by Google but not by users.
None of this will stop a determined thief, but will help deter and detect "accidental" thefts.
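The HTML-comment notice and the unique marker can be combined in one small helper. This is only a sketch: the function names are made up, and the owner string, customer id, and secret are placeholders. It assumes the token is computed once when a copy is packaged for a customer and then pasted into that copy's templates as a literal, so the secret never ships with the code.

```php
<?php
// Hypothetical helper names; owner, customer id, and secret are placeholders.

/**
 * Short per-customer token: an HMAC of the customer id under a secret
 * that only the author keeps. Different customers get different tokens.
 */
function watermarkToken(string $customerId, string $secret): string
{
    return substr(hash_hmac('sha256', $customerId, $secret), 0, 16);
}

/** HTML comment carrying the copyright notice and the token, for use in a page template. */
function copyrightComment(string $owner, int $year, string $token): string
{
    // Strip "--" so the owner text cannot end the comment early.
    $safeOwner = str_replace('--', '', $owner);
    return sprintf('<!-- Copyright (c) %d %s. Licensed copy %s -->', $year, $safeOwner, $token);
}
```

Finding a token in the wild tells you which customer's copy it came from, and because only you hold the secret, you can show that the token was issued by you.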
The proper way of prohibiting re-sale of your software is via legal constraints, not technical ones. Have your customer sign a contract where they agree not to re-sell.
Technical prevention measures universally make the product worse, also in the technical sense, and that lessens the value to the customers. The stronger the technical protection is, the bigger the nuisance.
For example, suppose the customer legitimately wants to change their domain name. Should they have to buy a new copy? I think not. If you tell them how to change the keyfile to match their new domain, they can then use that information to enable them to re-sell. However, the legal protection applies regardless of what technical tricks they come up with.
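To make the weakness concrete, here is a minimal sketch of a domain-bound key check of the kind proposed in the question. The function names and the secret are hypothetical. Note the assumption that hurts: the verifying code runs on the customer's server, so either the secret ships with it, or the check itself can simply be deleted.

```php
<?php
// Hypothetical sketch of a domain-bound key file check; names and secret are placeholders.

/** Lowercases the host and drops a port and a leading "www." so variants compare equal. */
function normalizeDomain(string $host): string
{
    $host = strtolower(trim($host));
    $host = preg_replace('/:\d+$/', '', $host);
    return preg_replace('/^www\./', '', $host);
}

/** Key the vendor would place in the key file for one licensed domain. */
function issueLicenseKey(string $domain, string $secret): string
{
    return hash_hmac('sha256', normalizeDomain($domain), $secret);
}

/** Check the application would run at startup against the current host name. */
function isLicensedFor(string $host, string $key, string $secret): bool
{
    // hash_equals avoids leaking information through comparison timing.
    return hash_equals(issueLicenseKey($host, $secret), $key);
}
```

A customer who changes domains needs a new key from you, and a customer who reads `isLicensedFor()` can issue keys for anyone, which is exactly the trade-off described above.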
But a problem is when you aren't afraid of the customer reselling what you have done, out of the box, which can be tracked by lawyers. The problem can be that the customer is refactoring it. I mean take my many hours of work and change a couple of things and call it his... Sell it for a small amount cheaper and win the business...
That is why I am looking at technical solutions for protecting my work. It may also help me keep the invoices from lawyers to a minimum; having a lawyer protect my work costs a substantial amount.
How can I be sure he doesn't do that anyway?
You can't prevent it...period. If anyone has the source there is no way to stop them...you can only then resort to punishing them if they do.
Perhaps your contract, besides forbidding them from reselling it, has a price associated with them reselling it, i.e. something like 10x or 20x what they would normally pay, plus any legal expenses required to get them to pay up. That way, if they choose to do it anyway, you have a nice piece of paper with their signature on it that already has a nice fat pre-agreed price they need to pay should they go ahead and sell it.
I haven't seen mention of ionCube and so was wondering if there is a reason for not using it?
Yes it costs money to set up and yes it requires a server side library to be installed (I daresay most hosts these days have it already running) but it does allow for domain restrictions as well as time based restrictions.
Maybe you could even use it in conjunction with PHPAudit?
