How can I test a bot detection script?

I have written a bot detection script using PHP. I want to test that script by sending bots to click on links so I can know if the script works. How can I do that?
Here is the PHP code:
function bot_detected($USER_AGENT){
    $crawlers = array(
        'Googlebot',
        'msnbot',
        'Yahoo',
        'Lycos',
        'facebookexternalhit'
    );
    $crawlers_agents = implode('|', $crawlers);
    if(strpos($crawlers_agents, $USER_AGENT) === false){
        return false;
    } else {
        return TRUE;
    }
}

You could use a Chrome plugin to set a custom user agent and test the script with it.
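If you would rather stay in PHP than install a browser plugin, a request with a spoofed user agent can be sketched as follows. This is only a sketch: the helper name `context_with_user_agent` and the URL in the usage comment are made up for illustration, while `user_agent` is a standard option of PHP's HTTP stream context.

```php
<?php
// Hypothetical helper: builds a stream context that sends the given User-Agent header.
function context_with_user_agent(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 5, // seconds
        ],
    ]);
}

$googlebot = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
$context   = context_with_user_agent($googlebot);

// Usage (placeholder URL; point it at a page that calls bot_detected()):
// $html = file_get_contents('http://localhost/page.php', false, $context);
```

The same context can be reused for every user agent you want to try, so a loop over a list of bot strings is enough to exercise the page.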

The heart of testing is simulating the environment.
First, install a testing framework. I recommend and use PHPUnit. It lets you write tests that live in separate files and survive changes to the code. Without a testing framework, you will inevitably write a throwaway program called a driver to do the same thing, but driver files often get lost or forgotten: each one is written on an as-needed basis, and there is usually no system for storing drivers together or for naming them consistently and predictably. For these reasons, I recommend learning a testing framework like PHPUnit, which will force you to think about test longevity and file naming conventions.
Once you have selected a testing framework, start by designing your test suite. Your short program is really just a function call, so the test needs to pass multiple values to your function and then check each response to ensure that you get the predicted result.
In a hybrid PHP-pseudocode, this might look like the following:
require 'myfile.php';
use PHPUnit\Framework\TestCase;
class MyTest extends TestCase{
    /**
     * Provides parameters and expected results to the test method.
     * Declared static because recent PHPUnit versions require static providers.
     */
    public static function providerOfTestCases(){
        return [
            'Googlebot Test Case' => [ 'Googlebot', true ],
            'msnbot Test Case' => [ 'msnbot', true ],
            // ... one case for each remaining bot name ...
            'nonbot test case' => [ 'randomStringData', false ]
        ];
    }
    /**
     * @dataProvider providerOfTestCases
     */
    public function testBotDetector( $userString, $expectedResult ){
        $functionResult = bot_detected( $userString );
        $message_on_failure = "When testing $userString, we expect "
            . ( $expectedResult ? "TRUE" : "FALSE" )
            . " but instead the function outputs "
            . ( $functionResult ? "TRUE" : "FALSE" );
        $this->assertEquals( $expectedResult, $functionResult, $message_on_failure );
    }
}
This test, for such a simple function, will mostly tell you what you already know: that you get a TRUE result for each string in your list of bot names.
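Feeding the tests a complete user-agent string rather than a bare bot name is more revealing. `strpos()` takes the haystack first and the needle second, so the function as posted looks for the whole user agent inside the list of bot names and returns false for a real Googlebot header. Below is a minimal sketch of a variant that searches the other way round, assuming a case-insensitive substring match is what was intended. The name `bot_detected_fixed` is made up here.

```php
<?php
// Sketch of a corrected check: look for each crawler name inside the user agent.
function bot_detected_fixed(string $userAgent): bool
{
    $crawlers = ['Googlebot', 'msnbot', 'Yahoo', 'Lycos', 'facebookexternalhit'];
    foreach ($crawlers as $crawler) {
        // stripos(haystack, needle) is case-insensitive and returns false when absent.
        if (stripos($userAgent, $crawler) !== false) {
            return true;
        }
    }
    return false;
}

$realAgent = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
var_dump(bot_detected_fixed($realAgent));                                      // bool(true)
var_dump(bot_detected_fixed('Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0')); // bool(false)
```

Adding a few full header strings like `$realAgent` to the data provider would make the test suite catch this class of mistake.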
In addition, I would add a logging function to your production system to keep track of all the $USER_AGENT values that are tested. The biggest problem with a function like yours is that it relies on your preset list being accurate. There is no way to verify in advance that the values you have listed are the values actually delivered to your system. By logging every value that is tested, you can regularly inspect the log for new values that should be added and for possible mistakes.
This second process relies on the comment @RiggsFolly made on your original post. Your log files will only be filled by actual bot visits, so you must be patient while they fill. Check the logs regularly, and make sure you are seeing the values you expect.
Remember to include the output of your function in the log so that you can triple-check its performance.
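A minimal sketch of such a log, assuming one line per check appended to a flat file. The function name `log_bot_check`, the file name in the usage comment, and the tab-separated format are illustrative choices, not part of the original script.

```php
<?php
// Appends "timestamp <TAB> result <TAB> user agent" to the given log file.
function log_bot_check(string $userAgent, bool $isBot, string $logFile): void
{
    // Strip line breaks so a crafted header cannot inject extra log lines.
    $cleanAgent = str_replace(["\r", "\n"], ' ', $userAgent);
    $line = sprintf("%s\t%s\t%s\n", date('c'), $isBot ? 'BOT' : 'HUMAN', $cleanAgent);
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}

// Usage inside the page, next to the existing check:
// $agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
// log_bot_check($agent, bot_detected($agent), __DIR__ . '/bot_checks.log');
```

Because the result is stored next to the raw header, lines marked HUMAN that clearly come from a crawler show up quickly when you scan the file.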
I hope all of this has been helpful. Happy coding!

Related

Dead lettering with php-amqplib and RabbitMQ?

I'm just starting out in using php-amqplib and RabbitMQ and want a way to handle messages that, for whatever reason, can't be processed and are nack'd. I thought that one way people handle this is with a dead letter queue. I'm trying to set this up but have not had any luck so far and hope someone could offer some suggestions.
My initiation of the queues looks a little something like:
class BaseAbstract
{
    /** @var AMQPStreamConnection */
    protected $connection;
    /** @var AMQPChannel */
    protected $channel;
    /** @var array */
    protected $deadLetter = [
        'exchange' => 'dead_letter',
        'type' => 'direct',
        'queue' => 'delay_queue',
        'ttl' => 10000 // in milliseconds
    ];
    protected function initConnection(array $config)
    {
        try {
            $this->connection = AMQPStreamConnection::create_connection($config);
            $this->channel = $this->connection->channel();
            // Setup dead letter exchange and queue
            $this->channel->exchange_declare($this->deadLetter['exchange'], $this->deadLetter['type'], false, true, false);
            $this->channel->queue_declare($this->deadLetter['queue'], false, true, false, false, false, new AMQPTable([
                'x-dead-letter-exchange' => $this->deadLetter['exchange'],
                'x-dead-letter-routing-key' => $this->deadLetter['queue'],
                'x-message-ttl' => $this->deadLetter['ttl']
            ]));
            $this->channel->queue_bind($this->deadLetter['queue'], $this->deadLetter['exchange']);
            // Set up regular exchange and queue
            $this->channel->exchange_declare($this->getExchangeName(), $this->getExchangeType(), true, true, false);
            $this->channel->queue_declare($this->getQueueName(), true, true, false, false, new AMQPTable([
                'x-dead-letter-exchange' => $this->deadLetter['exchange'],
                'x-dead-letter-routing-key' => $this->deadLetter['queue']
            ]));
            if (method_exists($this, 'getRouteKey')) {
                $this->channel->queue_bind($this->getQueueName(), $this->getExchangeName(), $this->getRouteKey());
            } else {
                $this->channel->queue_bind($this->getQueueName(), $this->getExchangeName());
            }
        } catch (\Exception $e) {
            throw new \RuntimeException('Cannot connect to the RabbitMQ service: ' . $e->getMessage());
        }
        return $this;
    }
    // ...
}
I thought this would set up my dead letter exchange and queue, and then my regular exchange and queue (with the getRouteKey, getQueueName, and getExchangeName/Type methods provided by extending classes).
When I try to handle a message like:
public function process(AMQPMessage $message)
{
    $msg = json_decode($message->body);
    if (empty($msg->payload) || empty($msg->payload->run)) {
        $message->delivery_info['channel']->basic_nack($message->delivery_info['delivery_tag'], false, true);
        return;
    }
    // removed for post brevity, but compose $cmd variable
    exec($cmd, $output, $returned);
    if ($returned !== 0) {
        $message->delivery_info['channel']->basic_ack($message->delivery_info['delivery_tag']);
    } else {
        $message->delivery_info['channel']->basic_nack($message->delivery_info['delivery_tag']);
    }
}
But I get back the error Something went wrong: Cannot connect to the RabbitMQ service: PRECONDITION_FAILED - inequivalent arg 'x-dead-letter-exchange' for queue 'delay_queue' in vhost '/': received 'dead_letter' but current is ''
Is this the way I should be setting up dead lettering? Different examples I've seen around all seem to show a bit of a different way of handling it, none of which seem to work for me. So I've obviously misunderstood something here and am appreciative of any advice. :)

Setting up (permanent) queues and exchanges is something you want to do once, when deploying code, not every time you want to use them. Think of them like your database schema - although the protocol provides "declare" rather than "create", you should generally be writing code that assumes things are configured a particular way. You could build the first part of your code into a setup script, or use the web- and CLI-based management plugin to manage these using a simple JSON format.
The error you are seeing is probably a result of trying to declare the same queue at different times with different parameters - the "declare" won't replace or reconfigure an existing queue, it will treat the arguments as "pre-conditions" to be checked. You'll need to drop and recreate the queue, or manage it via the management UI, to change its existing parameters.
Where run-time declares become more useful is when you want to dynamically create items in your broker. You can either give them names you know will be unique to that purpose, or pass null as the name to receive a randomly-generated name back (people sometimes refer to creating an "anonymous queue", but every queue in RabbitMQ has a name, even if you didn't choose it).
If I'm reading it correctly, your "schema" looks like this:
# Dead Letter eXchange and Queue
Exchange: DLX
Queue: DLQ; dead letter exchange: DLX, with key "DLQ"; automatic expiry
Binding: copy messages arriving in DLX to DLQ
# Regular eXchange and Queue
Exchange: RX
Queue: RQ; dead letter exchange: DLX, with key "DLQ"
Binding: copy messages from RX to RQ, optionally filtered by routing key
When a message is "nacked" in RQ, it will be passed to DLX, with its routing key overwritten to be "DLQ". It will then be copied to DLQ. If it is nacked from DLQ, or waits in that queue too long, it will be routed round to itself.
I would simplify in two ways:
Remove the dead letter exchange and TTL from the "dead letter queue" (which I've labelled DLQ); that loop's likely to be more confusing than useful.
Remove the x-dead-letter-routing-key option from the regular queue (which I've labelled RQ). The configuration of the regular queue shouldn't need to know whether the Dead Letter Exchange has zero, one, or several queues attached to it, so shouldn't know the name of that other queue. If you want nacked messages to go straight to one queue, just make it a "fanout exchange" (which ignores routing keys) or a "topic exchange" with the binding key set to # (which is a wildcard matching all routing keys).
An alternative might be to set x-dead-letter-routing-key to the name of the regular queue, i.e. to label which queue it came from. But until you have a use case for that, I'd keep it simple and leave the message with its original routing key.
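The simplified layout could be declared along these lines in a one-off setup script. This is a sketch under assumptions, not a drop-in replacement: the function name, the `work` and `work_queue` defaults, and the choice of a fanout exchange are illustrative, and the Composer autoloader for php-amqplib is assumed to be loaded already. Because a declare only checks an existing queue or exchange against the given arguments, any existing `dead_letter` exchange and `delay_queue` queue with different settings would have to be deleted before running it.

```php
<?php
use PhpAmqpLib\Wire\AMQPTable;

/**
 * One-off setup sketch, meant for a deploy script rather than every request.
 * $channel is expected to be a php-amqplib AMQPChannel; the 'work' names are placeholders.
 */
function declareDeadLetterTopology($channel, string $rx = 'work', string $rq = 'work_queue', string $routingKey = ''): void
{
    $dlx = 'dead_letter';
    $dlq = 'delay_queue';

    // Dead letter side: a durable fanout exchange, so routing keys do not matter.
    // Arguments: name, type, passive, durable, auto_delete.
    $channel->exchange_declare($dlx, 'fanout', false, true, false);
    // Plain durable queue with no TTL and no dead-letter arguments of its own.
    // Arguments: name, passive, durable, exclusive, auto_delete.
    $channel->queue_declare($dlq, false, true, false, false);
    $channel->queue_bind($dlq, $dlx);

    // Regular side: the queue only names the exchange that receives rejected messages.
    $channel->exchange_declare($rx, 'direct', false, true, false);
    // The arguments table is the seventh parameter, after nowait.
    $channel->queue_declare($rq, false, true, false, false, false, new AMQPTable([
        'x-dead-letter-exchange' => $dlx,
    ]));
    $channel->queue_bind($rq, $rx, $routingKey);
}

// Usage, assuming $connection is an open AMQPStreamConnection:
// declareDeadLetterTopology($connection->channel());
```

Note that RabbitMQ only dead-letters a message that is rejected or nacked with requeue set to false. A call such as `basic_nack($tag, false, true)` puts the message back on its original queue instead.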

Laravel Unit testing controllers error when using multiple methods

I am trying to test some of my controllers through Unit Testing. But there is something strange happening. With the following code in my testcase:
public function test_username_registration_too_short()
{
    $result = $this->action('POST', 'App\\Controllers\\API\\UserController@store', null, [
        'username' => 'foo'
    ]);
    $this->assertEquals('not_saved', $result->getContent());
    // $result = $this->action('POST', 'App\\Controllers\\API\\UserController@store', null, [
    //     'username' => 'foo'
    // ]);
    // $this->assertEquals('not_saved', $result->getContent());
}
public function test_username_registration_too_short_run_2()
{
    $result = $this->action('POST', 'App\\Controllers\\API\\UserController@store', null, [
        'username' => 'foo'
    ]);
    $this->assertEquals('not_saved', $result->getContent());
}
When I run this, the initial too_short test passes, but the exact same code on run 2 does not pass (it even manages to save the user). But if I put that same code twice in the same method (what is commented out now) it works perfectly? I have nothing in my setUp or tearDown methods. And I am a bit lost here.
The code in the controller is the following:
$user = new User(Input::all());
if($user->save())
{
    return 'saved';
}
return 'not_saved';

I'm not going to stop repeating myself over this question. There's a similar answer to a (somewhat) similar question. TL;DR: don't use a unit testing framework for functional / integration testing.
This is the area of functional testing and there is a fabulous framework
called Behat. You should do your own research, but essentially, while
PHPUnit is great at testing more or less independent blocks of
functionality it sucks at testing bigger things like full request
execution. Later you will start experiencing issues with session
errors, misconfigured environment, etc., all because each request is
supposed to be executed in its own separate space and you force it
into doing the opposite. Behat on the other hand works in a very
different way, where for each scenario (post robot, view non-existing
page), it sends a fresh request to the server and checks the result.
It is mostly used for final testing of everything working together by
making assertions on the final result (response object / html / json).
If you want to test your code the proper way, consider using the right tools for that. Once you know your way around Behat you'll fall in love with it, and you can use PHPUnit from within Behat to make individual assertions.

Should I be unit testing every piece of code

I have been starting unit testing recently and am wondering, should I be writing unit tests for 100% code coverage?
This seems futile when I end up writing more unit testing code than production code.
I am writing a PHP Codeigniter project and sometimes it seems I write so much code just to test one small function.
For Example this Unit test
public function testLogin(){
    //setup
    $this->CI->load->library("form_validation");
    $this->realFormValidation=new $this->CI->form_validation;
    $this->CI->form_validation=$this->getMock("CI_Form_validation");
    $this->realAuth=new $this->CI->auth;
    $this->CI->auth=$this->getMock("Auth",array("logIn"));
    $this->CI->auth->expects($this->once())
        ->method("logIn")
        ->will($this->returnValue(TRUE));
    //test
    $this->CI->form_validation->expects($this->once())
        ->method("run")
        ->will($this->returnValue(TRUE));
    $_POST["login"]=TRUE;
    $this->CI->login();
    $out = $this->CI->output->get_headers();
    //check new header ends with dashboard
    $this->assertStringEndsWith("dashboard",$out[0][0]);
    //tear down
    $this->CI->form_validation=$this->realFormValidation;
    $this->CI->auth=$this->realAuth;
}
public function badLoginProvider(){
    return array(
        array(FALSE,FALSE),
        array(TRUE,FALSE)
    );
}
/**
 * @dataProvider badLoginProvider
 */
public function testBadLogin($formSubmitted,$validationResult){
    //setup
    $this->CI->load->library("form_validation");
    $this->realFormValidation=new $this->CI->form_validation;
    $this->CI->form_validation=$this->getMock("CI_Form_validation");
    //test
    $this->CI->form_validation->expects($this->any())
        ->method("run")
        ->will($this->returnValue($validationResult));
    $_POST["login"]=$formSubmitted;
    $this->CI->login();
    //check it went to the login page
    $out = output();
    $this->assertGreaterThan(0, preg_match('/Login/i', $out));
    //tear down
    $this->CI->form_validation=$this->realFormValidation;
}
For this production code
public function login(){
    if($this->input->post("login")){
        $this->load->library('form_validation');
        $username=$this->input->post('username');
        $this->form_validation->set_rules('username', 'Username', 'required');
        $this->form_validation->set_rules('password', 'Password', "required|callback_userPassCheck[$username]");
        if ($this->form_validation->run()===FALSE) {
            $this->load->helper("form");
            $this->load->view('dashboard/login');
        }
        else{
            $this->load->model('auth');
            echo "valid";
            $this->auth->logIn($this->input->post('username'),$this->input->post('password'),$this->input->post('remember_me'));
            $this->load->helper('url');
            redirect('dashboard');
        }
    }
    else{
        $this->load->helper("form");
        $this->load->view('dashboard/login');
    }
}
Where am I going so wrong?

In my opinion, it's normal for there to be more test code than production code. But test code tends to be straightforward; once you get the hang of it, writing tests is a no-brainer.
Having said that, if you find your test code is too complicated to write, or cannot cover all the execution paths in your production code, that's a good indicator that some refactoring is due: your method may be too long, attempt to do several things, or have too many external dependencies.
Another point is that high test coverage is good, but it does not need to be 100% or some other very high number. Some code has no logic of its own, such as code that simply delegates tasks to others. You can skip testing such code and use the @codeCoverageIgnore annotation to exclude it from your code coverage.
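As an illustration, the annotation goes in the docblock of the method (or class) to be excluded. The `Mailer` and `Notifier` classes below are hypothetical, and newer PHPUnit releases may prefer a different mechanism, so check the documentation for the version you use.

```php
<?php
class Mailer
{
    public function send(string $to, string $body): bool
    {
        return $to !== '' && $body !== '';
    }
}

class Notifier
{
    private $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    /**
     * Pure delegation with no logic of its own, so it is excluded from coverage.
     *
     * @codeCoverageIgnore
     */
    public function notify(string $to, string $body): bool
    {
        return $this->mailer->send($to, $body);
    }
}
```

The annotation only changes what the coverage report counts; the method still runs normally in production and in any test that happens to call it.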

In my opinion it's logical that tests take much more code, because you have to test multiple scenarios, provide test data, and check that data for every case.
Typically a test coverage of 80% is a good value. In most cases it's not necessary to test 100% of the code, because you should not test, for example, setters and getters. Don't test just for the statistics ;)

The answer is that it depends, but generally no. If you are publishing a library, then lots of tests are important and can even help create the examples for the docs.
For internal projects you would probably want to focus your tests on complex functions and on things that would be bad if they went wrong. For each test, think about what value there is in having the test here rather than in the parent function.
What you want to avoid is testing anything that relies on implementation details, such as private methods or functions; otherwise, whenever you change the structure, you'll find yourself rewriting the entire suite of tests.
It is better to test at a higher level: the public functions, or anything at the boundary between modules. A few tests at the highest level you can manage should yield reasonable coverage and ensure the code works the way it is actually called. This is not to say lower-level functions shouldn't have tests, but at that level they serve more to check edge cases than to cover the typical case of every function.
Instead of creating tests to increase coverage, create tests to cover bugs as you find and fix them, and create tests for new functionality that you would have to test manually anyway. Create tests to guard against bad things that must not happen. Fragile tests that break easily during refactors should be removed or changed to depend less on the implementation of the function.

how to fix CConsoleApplication.user undefined in a command class in yii framework?

I get that error whenever I run my simple cron script in the shell.
Any idea how to fix it? The error itself says that .user is undefined.
When I placed the
'user' => array(
// enable cookie-based authentication
'allowAutoLogin' => true,
'loginUrl' => array('myaccount/blah/login'),
in the console config, it looks for a "Class". What class am I supposed to include in that array? The user login URL uses LDAP for login and authentication. What should I do?

A CConsoleApplication is for handling offline tasks. For example, a cron job on a Linux system starts the application, which checks each day whether any registered user of your website must change their password because it has to be changed every 3 months. If the password has expired, it sends an email.
To preselect the users, you have defined a scope that checks the status of a user, as well as a scope that restricts signed-in users to their own data:
public function scopes(){
    return array(...,
        'user' => array(
            'condition'=>'id='.Yii::app()->user->id,
        ),
        'active' => array(
            'condition'=>'status='.self::STATUS_ACTIVE,
        ), ...
    );
}
Now, in your CConsoleApplication code you use the active scope to get all users:
$usersArray = User::model()->active()->findAll(); ...foreach....
The problem here lies in the CActiveRecord class, which models stored in a database usually extend. CActiveRecord's __call method loads all scopes defined for a model and then merges the requested scopes with the rest of the database criteria. Because all scopes are loaded first, loading the user scope, which includes Yii::app()->user->id, causes the error. The WebUser is requested and the exception 'CException' with message 'attribute "CConsoleApplication.user" is not defined' is thrown. You did not mean to call the WebUser, but the automatism does it for you :-)
So, do it like schmunk says. Add a special case to your scope code that ensures Yii::app()->user is not called:
public function scopes(){
    if (Yii::app() instanceof CConsoleApplication) {
        $user = array(''); //no condition
    }else{
        $user = array(
            'condition'=>'id='.Yii::app()->user->id,
        );
    }
    return array(
        'user' => $user,
        'active' => array(
            'condition'=>'status='.self::STATUS_ACTIVE,
        ), ...
    );
}
I hope the explanation helps, perhaps with other problems as well.

Short answer: You can't use CWebUser in console application. Don't include it in your config/console.php
Long(er) answer: If you rely on a component, which needs CWebUser, you'll have to detect this in the component and create some kind of workaround for this case. Have a look at this code piece for an example how to detect, if you're running a console app.

Try this
public static $console_user_id;
public function init() {
    if (Yii::app() instanceof CConsoleApplication) {
        if (self::$console_user_id) $this->id = self::$console_user_id;
        return false;
    }
    parent::init();
}

Solved my problem by using update instead of save in the script. No need to use the user array and the CWebUser class.

I had the same problem. I went through all the answers given here and found some good points, but solved my problem my own way, although it may not be the best.
First of all I had to figure out that my cron job threw the aforementioned exception because it was running a script that contained this code:
if(Yii::app()->user->isAdmin()) {...} else {...}
So the console threw the error because the user was not defined. I changed my code to test whether it was being run from the console. The changed code is as follows:
$console = false;
try {
    $test = Yii::app()->user->isAdmin();
}
catch (CException $e) {
    $console = true;
}
// When $console is true the second operand is never evaluated, so no exception is thrown here.
if ($console || Yii::app()->user->isAdmin()) {...} else {...}
As said, not perfect, but maybe a solution for someone.

How to design error reporting in PHP

How should I write error reporting modules in PHP?
Say I want to write a function in PHP: 'bool isDuplicateEmail($email)'.
In that function, I want to check whether $email is already present in the database.
It returns 'true' if it exists, otherwise 'false'.
Now, the query execution can also fail. In that case I want to report 'Internal Error' to the user.
The function should not die with a typical MySQL error: die(mysql_error()). My web app has two interfaces: browser and email (you can perform certain actions by sending an email).
In both cases the error should be reported in a presentable way.
Do I really have to use exception handling for this?
Can anyone point me to a good PHP project where I can learn how to design a robust PHP web app?

In my PHP projects, I have tried several different tacks. I've come to the following solution, which seems to work well for me:
First, any major PHP application I write has some sort of central singleton that manages application-level data and behaviors. The "Application" object. I mention that here because I use this object to collect generated feedback from every other module. The rendering module can query the application object for the feedback it deems should be displayed to the user.
On a lower-level, every class is derived from some base class that contains error management methods. For example an "AddError(code,string,global)" and "GetErrors()" and "ClearErrors". The "AddError" method does two things: stores a local copy of that error in an instance-specific array for that object and (optionally) notifies the application object of this error ("global" is a boolean) which then stores that error for future use in rendering.
So now here's how it works in practice:
Note that 'Object' defines the following methods: AddError, ClearErrors, GetErrorCodes, GetErrorsAsStrings, GetErrorCount, and maybe HasError for convenience.
// $GLOBALS['app'] = new Application();
class MyObject extends Object
{
    /**
     * @return bool Returns false if failed
     */
    public function DoThing()
    {
        $this->ClearErrors();
        if ([something succeeded])
        {
            return true;
        }
        else
        {
            $this->AddError(ERR_OP_FAILED,"Thing could not be done");
            return false;
        }
    }
}
$ob = new MyObject();
if ($ob->DoThing())
{
    echo 'Success.';
}
else
{
    // Right now, I may not really care *why* it didn't work (the user
    // may want to know about the problem, though; see below).
    $ob->TrySomethingElse();
}
// ...LATER ON IN THE RENDERING MODULE
echo implode('<br/>',$GLOBALS['app']->GetErrorsAsStrings());
I like this because:
I hate exceptions because I personally believe they make code more convoluted than it needs to be
Sometimes you just need to know that a function succeeded or failed and not exactly what went wrong
A lot of times you don't need a specific error code but you need a specific error string and you don't want to create an error code for every single possible error condition. Sometimes you really just want to use an "opfailed" code but go into some detail for the user's sake in the string itself. This allows for that flexibility
Having two error collection locations (the local level for use by the calling algorithm and global level for use by rendering modules for telling the user about them) has really worked for me to give each functional area exactly what it needs to get things done.
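A runnable sketch of this two-level error collection might look like the following. Several details are assumptions made for the example: the base class is called `BaseObject` because `object` is a reserved word since PHP 7.2, the application object is passed through the constructor rather than read from `$GLOBALS['app']`, and the `$succeeds` flag stands in for the real work.

```php
<?php
const ERR_OP_FAILED = 1;

// Application-level collector (a plain object here instead of a singleton).
class Application
{
    private $errors = [];

    public function AddError(int $code, string $message): void
    {
        $this->errors[] = ['code' => $code, 'message' => $message];
    }

    public function GetErrorsAsStrings(): array
    {
        return array_column($this->errors, 'message');
    }
}

// Base class: keeps a local error list and optionally forwards to the application.
abstract class BaseObject
{
    private $errors = [];
    private $app;

    public function __construct(Application $app)
    {
        $this->app = $app;
    }

    public function AddError(int $code, string $message, bool $global = true): void
    {
        $this->errors[] = ['code' => $code, 'message' => $message];
        if ($global) {
            $this->app->AddError($code, $message);
        }
    }

    public function ClearErrors(): void { $this->errors = []; }
    public function GetErrorCount(): int { return count($this->errors); }
    public function GetErrorCodes(): array { return array_column($this->errors, 'code'); }
    public function GetErrorsAsStrings(): array { return array_column($this->errors, 'message'); }
}

class MyObject extends BaseObject
{
    // Returns false on failure; the reason is available through GetErrorsAsStrings().
    public function DoThing(bool $succeeds): bool
    {
        $this->ClearErrors();
        if (!$succeeds) {
            $this->AddError(ERR_OP_FAILED, 'Thing could not be done');
            return false;
        }
        return true;
    }
}

$app = new Application();
$ob  = new MyObject($app);
$ok  = $ob->DoThing(false);

// Later, in the rendering module:
echo implode('<br/>', $app->GetErrorsAsStrings());
```

Clearing the local list at the start of each operation keeps the per-object errors tied to the last call, while the application-level list keeps accumulating for the renderer.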

Using MVC, I always use some sort of default error/exception handler, which catches exceptions from actions that have no error or exception handling of their own.
There you could decide to answer via email or browser response, and it will always have the same look :)
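Outside of a framework, the same idea can be sketched with a single default handler and one renderer per interface. Everything named here is hypothetical: the `InternalErrorException` class, the `$runQuery` callable standing in for the real database call, the message formats, and the crude use of `PHP_SAPI` to pick the interface.

```php
<?php
// Hypothetical exception type for failures that should surface as "Internal Error".
class InternalErrorException extends RuntimeException {}

// $runQuery is a stand-in for the real database call; it returns a row count
// or throws when the query itself fails.
function isDuplicateEmail(string $email, callable $runQuery): bool
{
    try {
        return $runQuery($email) > 0;
    } catch (Exception $e) {
        // Hide driver details from the user, keep them as the previous exception.
        throw new InternalErrorException('Internal Error', 0, $e);
    }
}

// One renderer per interface keeps the message consistent everywhere.
function renderError(Throwable $e, string $interface): string
{
    $text = $e instanceof InternalErrorException ? $e->getMessage() : 'Unexpected error';
    return $interface === 'email'
        ? "Sorry, your request failed: $text"
        : '<p class="error">' . htmlspecialchars($text) . '</p>';
}

// Default handler for anything the actions did not catch themselves.
set_exception_handler(function (Throwable $e) {
    error_log((string) $e); // full details go to the log only
    // Placeholder detection: treat command-line runs as the email interface.
    echo renderError($e, PHP_SAPI === 'cli' ? 'email' : 'browser');
});
```

With this in place, `isDuplicateEmail()` keeps its boolean return value for the normal case, and a failed query travels up to the one place that knows how to present it.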

I'd use a framework like Zend Framework that has a thorough exception handling mechanism built all through it.

Look into exception handling and error handling in the PHP manual. Also read the comments at the bottom; they are very useful.
Those pages also explain a method for converting PHP errors into exceptions, so you only deal with exceptions (for the most part).
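The conversion is usually done with `set_error_handler()` and the built-in `ErrorException` class. A minimal sketch, using `trigger_error()` as a stand-in for whatever would normally raise the warning:

```php
<?php
// Turn PHP warnings and notices into ErrorException so one catch block handles both.
set_error_handler(function (int $severity, string $message, string $file, int $line) {
    if (!(error_reporting() & $severity)) {
        return false; // respect the @ operator and the configured error_reporting level
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    trigger_error('Query failed', E_USER_WARNING); // stand-in for a failing call
    $result = 'no error';
} catch (ErrorException $e) {
    $result = 'Internal Error';
}

echo $result, "\n";
```

Fatal errors do not pass through this handler, which is why the conversion covers most but not all cases.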
