I'm trying to figure out a way to manage quotas in a CATI system (written in PHP + SQL and XML).
Let's say we have a population like this:
CITY | #MALE | #FEMALE | AGE CLUSTER (YOUNG) | AGE CLUSTER (OLD)
NY   | 200   | 250     | 350                 | 100
LA   | 300   | 350     | 250                 | 400
Then we have the db containing all the people to be interviewed:
(name, city, sex, age cluster, telephone)
This db will not necessarily be representative of the first table. We also have to consider wrong telephone numbers and any other situation that may force us to drop a record and move on to the next one.
So, how can we achieve good quota management by the end of the campaign? What's the best approach? It would also be great to maintain quotas over time: let's say my campaign lasts 1 year; I would like to perform a checkpoint at the end of the first 2 months and find that the quotas are OK...
The queXS software (I am the author) implements quotas for telephone interviewing (it calls them row quotas). The code is available here.
Have a look at the admin/rowquota.php file and the functions/functions.operator.php file.
Basically what occurs is:
Setup:
You have a list of people to be interviewed (sample) as you describe
There should be 2 lists, split by area (LA, NY)
Each sample would have a quota of Males, Females, and Age cluster Young/Old
Running:
The system records the outcomes of contacts to each number
Where the outcome is "completed", the system finds all quotas that are fulfilled by that record and increments the count for each of them
Where a quota is reached, all records that match its query (e.g. Males in LA) will be excluded from further calling
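The running steps above could be sketched roughly as follows. This is not the queXS implementation; it is a minimal in-memory illustration, and the array layout and the function names recordCompleted and isCallable are assumptions made for the example. The quota targets are the numbers from the question's table.

```php
<?php
// Quota targets per sample (city), taken from the question's table.
$quotas = [
    'NY' => ['sex' => ['M' => 200, 'F' => 250], 'age' => ['young' => 350, 'old' => 100]],
    'LA' => ['sex' => ['M' => 300, 'F' => 350], 'age' => ['young' => 250, 'old' => 400]],
];

// Running count of completed interviews, same shape as $quotas (hypothetical layout).
$completed = [];

// Record a completed interview against every quota cell the respondent falls into.
function recordCompleted(array &$completed, array $person): void
{
    $city = $person['city'];
    foreach (['sex', 'age'] as $dim) {
        $value = $person[$dim];
        $completed[$city][$dim][$value] = ($completed[$city][$dim][$value] ?? 0) + 1;
    }
}

// A record stays callable only while none of its quota cells is full.
function isCallable(array $quotas, array $completed, array $person): bool
{
    $city = $person['city'];
    foreach (['sex', 'age'] as $dim) {
        $value = $person[$dim];
        $done = $completed[$city][$dim][$value] ?? 0;
        if ($done >= $quotas[$city][$dim][$value]) {
            return false;
        }
    }
    return true;
}
```

In a real system the counts would live in the database and the exclusion would be part of the query that selects the next number to call; a periodic checkpoint could compare $completed with $quotas cell by cell.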
Describing the code here would be a bit tedious as a lot of the code is specific to the database setup of the system, but if you require further explanation please let me know.
I am building a small system to control my rental properties. In this system, I want to be able to capture any regulations that can happen to a lease. A regulation can happen for:
The rent amount
The prepaid rent amount
The prepaid deposit amount
I am thinking about how I can structure this. I have a Lease model with a leases table (simplified):
id | name | address | rent | prepaid_rent_amount | prepaid_deposit_amount
1 | Unit no. 1 | Highland Drive | 1000 | 3000 | 3000
In order to capture whether any regulation has happened to the above three parameters (rent amount, prepaid rent or prepaid deposit), I need to store this in a database. The information that needs to be captured for these parameters is almost identical:
//Regulation in rent
- lease_id
- amount
- date
- flag
//Regulation in prepaid rent
- lease_id
- amount
- date
//Regulation in prepaid deposit
- lease_id
- amount
- date
Solution 1
I was thinking to create a model/table for each parameter, and simply just store data relevant for each regulation in the corresponding table:
RentRegulation
PrepaidRentRegulation
PrepaidDepositRegulation
I think the benefit of this solution is that the data is separated; however, it feels like it is bloating my codebase.
Solution 2
Instead of creating three models/tables to store the above information in, I was thinking to just create one model:
Regulation
And inside this model, simply have a column called type. The value of this column would then control what kind of regulation we are talking about (e.g. RentRegulation, PrepaidRentRegulation etc.)
The only concern I have with solution #2 is that for the type RentRegulation there can be different kinds of regulation. A rent can be regulated for the following reasons (all regulated by law):
Increase in inflation
Renovating the lease
Agreement with the tenant that rent should increase
To cater for the above, I was thinking of also adding a nullable flag column to the regulations table, and then just relying on scopes when retrieving data?
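To make Solution 2 concrete, here is a minimal sketch in plain PHP with no framework. The type and flag values, the sample rows and the ofType helper are all hypothetical; ofType only plays the role that a query scope would play in a real model.

```php
<?php
// Made-up rows as they might come back from a single "regulations" table.
// The type and flag strings are illustrative assumptions, not a fixed vocabulary.
$regulations = [
    ['lease_id' => 1, 'type' => 'rent', 'amount' => 1050, 'date' => '2020-01-01', 'flag' => 'inflation'],
    ['lease_id' => 1, 'type' => 'prepaid_rent', 'amount' => 3150, 'date' => '2020-01-01', 'flag' => null],
    ['lease_id' => 1, 'type' => 'rent', 'amount' => 1200, 'date' => '2021-01-01', 'flag' => 'renovation'],
];

// Stand-in for a query scope: filter by type, and optionally by flag.
function ofType(array $rows, string $type, ?string $flag = null): array
{
    return array_values(array_filter(
        $rows,
        function (array $row) use ($type, $flag): bool {
            return $row['type'] === $type
                && ($flag === null || $row['flag'] === $flag);
        }
    ));
}

// All rent regulations, then only those caused by inflation.
$rentRegulations = ofType($regulations, 'rent');
$inflationRegulations = ofType($regulations, 'rent', 'inflation');
```

The nullable flag only carries meaning for the rent type here, which is the trade-off of the single-table design: one table and one model, at the cost of a column that is empty for the other two types.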
I have a problem with my search function. When a user types a sentence in the search field, I want to get the result based on the keywords inside that sentence. For example, I have a database table like this:
ID | Keywords | Answer
-----------------------------------------------------------------------------
1 | price, room | The price room is $150 / night
2 | credit card | Yes, you could pay with credit card
3 | location | The Hotel location is in the Los Angeles
4 | how to, way to, book | You could pay with credit card or wire transfer
5 | room, size | The room size is 50sqm
And these are examples of sentences that a user might input:
What is the room price ?
From that sentence the system should find the keywords inside it; in this case the keywords are room and price.
From those keywords the system should show the answer: The price room is $150 / night
Can I pay with credit card ?
From that sentence the system should find the keywords inside it; in this case the keyword is credit card.
From that keyword the system should show the answer: Yes, you could pay with credit card
What is the room size ?
From that sentence the system should find the keywords inside it; in this case the keywords are room and size.
From those keywords the system should show the answer: The room size is 50sqm
Examples 1 and 3 both have room in the sentence. I would also want the system to know that the keywords are room price and room size respectively.
How could I find the keywords in the sentence that the user has entered?
How do I get the answer from the database with those keywords?
I want to know how I could do this with PHP and MySQL, or whether there is some other way to build it. If anybody knows how to do this, please help me. Thanks in advance.
I would suggest not storing keywords separated by commas in a single row; instead, insert them in separate rows. Otherwise, when you search for any text in keywords, it will always be compared against the whole value credit card or price, room. It will not treat price and room as separate words; it will treat the value as one string.
For your question, try the following code (it assumes one keyword per row, as suggested above):
$que = 'What is the room price';
// Split the sentence into words and build one placeholder per word
$words = preg_split('/\W+/', $que, -1, PREG_SPLIT_NO_EMPTY);
$placeholders = implode(',', array_fill(0, count($words), '?'));
$sql = 'SELECT answer FROM your_table WHERE keywords IN (' . $placeholders . ')';
// Run $sql as a prepared statement, binding $words to the placeholders
Or you can try FIND_IN_SET() to search comma-separated keywords.
It may work.
My approach would be to use the concept of STOP WORDS: remove all STOP WORDS from the user query.
Then search only for ALL the KEY WORDS that remain in the user query.
DATA entry needs to strip out most of the user's input to be robust. What if they intend to break your system by inserting CODE?
STOP WORDS include 'the', 'a', 'of'.
The idea is to remove as much rubbish as you can and then to be very picky about the other words.
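As a rough illustration of the stop-word idea, the sketch below strips stop words from the query and then picks the table row whose keywords overlap most with what is left. The stop-word list is a tiny assumed sample, and extractKeywords and bestAnswer are hypothetical helper names; the rows are passed in as an array so the example runs without a database.

```php
<?php
// Assumed stop-word list; a real one would be much longer.
$stopWords = ['what', 'is', 'the', 'a', 'of', 'can', 'i', 'with'];

// Lower-case the query, split it on anything that is not a letter, drop stop words.
// Splitting on non-letters also discards punctuation and most injected code.
function extractKeywords(string $query, array $stopWords): array
{
    $words = preg_split('/[^a-z]+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique(array_diff($words, $stopWords)));
}

// Return the answer of the row sharing the most words with the query, or null if none match.
function bestAnswer(array $rows, array $queryWords): ?string
{
    $best = null;
    $bestScore = 0;
    foreach ($rows as $row) {
        $rowWords = preg_split('/[^a-z]+/', strtolower($row['keywords']), -1, PREG_SPLIT_NO_EMPTY);
        $score = count(array_intersect($queryWords, $rowWords));
        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $row['answer'];
        }
    }
    return $best;
}
```

With the rows from the question, "What is the room price ?" reduces to room and price, which matches the "price, room" row on two words and the "room, size" row on only one, so the price answer wins.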
Log the query data for inspection in case of failure. Log the ACCESS data that you think you are processing, and then set a timeout on the response time. E.g. if you know that the query should only ever take X ms, then anything that takes longer than that is suspect. It could have gotten past your protective layer. DO make sure you log the IP address and timestamp in the log files, preferably right at the start of the log entry.
Then write scripts for handling a SLICE. A SLICE is a nice way to help the system administrators, who may have to send you a slice of the log files.
The slice can be complicated: it runs from one DAY (YYYYMMDDmm.s) to another DAY, and there may have been an overnight compression system running, so your script needs to read both normal log files and compressed log files. Sometimes the files are split up by system failures, i.e. the system died for some reason.
Your SLICE info can be packaged up into an email etc. and sent to you for analysis.
Good luck.
With all of the daily fantasy games out there, I am looking to see if I can easily implement a platform that will help identify the optimal lineup for a fantasy league based on a salary cap and projected points for each player.
Given a pool of ~500 players, you need to find the highest-scoring lineup within the maximum salary cap constraints:
1 Quarter Back
2 Running Back
3 Wide Receiver
1 Tight End
1 Kicker
1 Defense
Each player is assigned a salary (that changes weekly) and I will assign projected points for those players. I have this information in a MySQL DB and would prefer to use PHP/Pear or JQuery if that's the best option for calculating this.
The Table looks something like this
player_id | name         | position | salary | ranking | projected_points
1         | Joe Smith    | QB       | 1000   | 2       | 21.7
2         | Jake Plummer | QB       | 2500   | 6       | 11.9
I've tried sorting by projected points and filling in the roster. That obviously provides the highest-scoring team, but it also exceeds the salary cap. I cannot think of a way to have it intelligently remove players and continue to loop through to find the highest-scoring lineup within the salary constraints.
So, is there any PHP or Pear class that you know of that will help "solve" this type of problem? Any articles you can point me to for reference? I'm not asking for someone to do this for me, but I've been Googling for a while and the best solution I currently have is this: http://office.microsoft.com/en-us/excel-help/pick-your-fantasy-football-team-with-solver-HA001124603.aspx and that uses Excel and is limited to 200 objects.
I'll suggest two approaches to this problem.
The first is dynamic programming. For brute force, we could initialize a list containing the empty partial team, then, for each successive player, for each partial team currently in the list, add a copy of that partial team with the new player, assuming that this new partial team respects the positional and budget constraints. This is an exponential-time algorithm, but we can reduce the running time by quite a lot (to O(#partial position breakdowns * budget * #players), assuming that all monetary values are integer) if we throw away all but the best possibility so far for each combination of partial position breakdown and budget.
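A minimal sketch of that pruning idea, assuming integer salaries: each state is keyed by the position breakdown plus the salary spent, and only the best-scoring partial team is kept per key. The function name bestLineup and the array field names are assumptions for illustration; with ~500 players and a large cap the state table can still get big, so treat this as a starting point rather than a tuned solver.

```php
<?php
// $players: list of ['name', 'position', 'salary', 'points'] (integer salaries assumed).
// $slots:   required count per position, e.g. ['QB' => 1, 'RB' => 2].
// Returns the best complete lineup within $cap, or null if none exists.
function bestLineup(array $players, array $slots, int $cap): ?array
{
    $start = ['counts' => array_fill_keys(array_keys($slots), 0),
              'salary' => 0, 'points' => 0.0, 'team' => []];
    // One entry per (position breakdown, salary spent); each holds the best team so far.
    $states = [implode(',', $start['counts']) . '|0' => $start];

    foreach ($players as $p) {
        $pos = $p['position'];
        if (!isset($slots[$pos])) {
            continue; // position not used in this lineup
        }
        $next = $states; // every existing state may also skip this player
        foreach ($states as $s) {
            if ($s['counts'][$pos] >= $slots[$pos] || $s['salary'] + $p['salary'] > $cap) {
                continue; // position already full, or the cap would be exceeded
            }
            $s['counts'][$pos]++;
            $s['salary'] += $p['salary'];
            $s['points'] += $p['points'];
            $s['team'][] = $p['name'];
            $key = implode(',', $s['counts']) . '|' . $s['salary'];
            // Throw away all but the best possibility for this key.
            if (!isset($next[$key]) || $next[$key]['points'] < $s['points']) {
                $next[$key] = $s;
            }
        }
        $states = $next;
    }

    // Among the states that fill every slot, pick the highest projected points.
    $best = null;
    foreach ($states as $s) {
        if ($s['counts'] === $slots && ($best === null || $s['points'] > $best['points'])) {
            $best = $s;
        }
    }
    return $best;
}
```

For the roster in the question, $slots would be ['QB' => 1, 'RB' => 2, 'WR' => 3, 'TE' => 1, 'K' => 1, 'DEF' => 1], where the position codes are whatever the position column actually stores.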
The second is to find an integer programming library callable from PHP that works like Excel's solver. It looks like (e.g.) lpsolve comes with a PHP interface. Then we can formulate an integer program like so.
maximize sum_{player p} value_p x_p
subject to
sum_{quarterback player p} x_p <= 1
sum_{running back player p} x_p <= 2
...
sum_{defense player p} x_p <= 1
sum_{player p} cost_p x_p <= budget
for each player p, x_p in {0, 1} (i.e., x_p is binary)
I'm trying to figure out how to build a specific algorithm (ultimately implemented in PHP, but that's less important), but I'm having a hard time wrapping my head around the best way to do the math. Instead of defining a complex industry-specific process, I'll use a crazy metaphor here (the math is what's important). Imagine you're trying to identify the percent chance a specific make of car is parked in a store's parking lot based on the items sold within the store. To begin you take a physical survey of 100,000 store parking lots, recording each unique car make spotted outside, each unique item sold within the store, and a fixed percent relevance that item has to the store (ex: lumber has an 89% relevance to Home Depot, but pencils only have a 23% relevance to Walmart).
There are two parts to what I’m trying to solve. First, I’m trying to figure out the best way to roll-up this data to a specific item, while respecting each relevance percent and the number of confirmed observations (so one spotting doesn’t equal 100% chance, similar to http://www.evanmiller.org/how-not-to-sort-by-average-rating.html ). In other words, if a brand new, never-before-seen store is selling Waterford glasses and cashmere sweaters, from those items we can predict there’s an 89% chance a Mercedes is in the parking lot.
So to recap:
Each item has been seen a specific number of times in a store. For each of those times, there is a different product/store relevance percentage and a list of all car makes in the parking lot. How do I best mathematically calculate the percent chance a specific make is in the parking lot of a brand new store, only based on the items within?
Now the second part of this is getting a bit more complicated by adding another layer of abstraction. If a single person visits 50 stores, and we aggregate all the items in all those stores, we can predict what type of car they drive (ex: lots of camping and hiking stores, so they have a 67% chance of driving a Jeep). Then if they visit a new store and are exposed to a brand new item, for which we have no data, I need to apply that 67% Jeep onto the new item (still respecting the relevance of that item to the store). Then use that item’s less-than-certain Jeep statistic to influence our predictions of parking lots that contain that new item (which was never directly measured). Perhaps this requires us to add a confidence interval of some kind? Or how can we represent that uncertainty, without every one of the millions of items we analyze eventually averaging out to 50%?
I REALLY appreciate your help on this!
I think you need to build a cross-correlation matrix,
where rows are goods and columns are car types.
Each cell contains a normalized coefficient describing how strongly some
good (e.g. a diamond ring) is related to a car type (Geo or Mercedes).
For details, see:
http://en.wikipedia.org/wiki/Cross-correlation
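As a simpler stand-in for a full cross-correlation, the sketch below builds a goods-by-car-type matrix in which each cell is the relevance-weighted share of stores selling the item where that make was seen. This is an assumption about what "normalized coefficient" could mean, not a formula from the linked article, and the function name buildMatrix and the observation layout are hypothetical. It does not yet account for the number of observations, which the question also asks about.

```php
<?php
// $observations: one entry per surveyed store, with
//   'items' => [item name => relevance between 0 and 1]
//   'cars'  => list of car makes seen in the parking lot
// Returns [item][make] => weighted share between 0 and 1.
function buildMatrix(array $observations): array
{
    $weightSeen = [];  // [item][make] => summed relevance where the make was seen
    $weightTotal = []; // [item] => summed relevance over all stores selling the item
    foreach ($observations as $obs) {
        foreach ($obs['items'] as $item => $relevance) {
            $weightTotal[$item] = ($weightTotal[$item] ?? 0.0) + $relevance;
            foreach (array_unique($obs['cars']) as $make) {
                $weightSeen[$item][$make] = ($weightSeen[$item][$make] ?? 0.0) + $relevance;
            }
        }
    }

    $matrix = [];
    foreach ($weightSeen as $item => $makes) {
        if ($weightTotal[$item] <= 0) {
            continue; // item only ever seen with zero relevance; no usable signal
        }
        foreach ($makes as $make => $weight) {
            $matrix[$item][$make] = $weight / $weightTotal[$item];
        }
    }
    return $matrix;
}
```

A row of this matrix can then be read as the estimate for a single item; combining several items for a new store still needs a rule of its own, such as averaging the rows weighted by relevance.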
Greetings,
I am planning a PHP and MySQL based app that will help in locating the correct page for a particular address in a map book. The map book uses a high and low address range to place sections of a street or highway in the book, each section with its own page (or sub-section of a page).
The user will enter the street address, house number separate from street name, and the desired result is to print details including the map page. What would be the best way to determine the corresponding high and low range based on the house number given in MySQL?
The table will be similar to this:
id, street_name, low_address, high_address, map_page
An example entry would be:
1, Elm Street, 1, 100, 30
Thanks!
Hate it when I end up posting my own solution, but MySQL's BETWEEN operator fit the bill for this. Here is what I ended up with:
MySQL Prepared Statement
SELECT * FROM dispatch WHERE streetName = ? AND ? BETWEEN lowAddress AND highAddress LIMIT 1
PHP
$getInfo->execute(array($streetName, $houseNum));