I have a feature file in Behat (below) where I define the table headings. I have been using getRowsHash() to get the table headings and it has been working fine.
| TableHeadings |
| FlagIcon |
| Flight |
| Stand |
| From |
But just recently, while testing a page with 18 headings, it started failing. I couldn't find any answers, so I tried getHash() instead and it worked fine.
Is there a limitation with getRowsHash() beyond 15 rows, or should I be using getRows() or getHash() instead?
Note: if I use getRowsHash(), I get an error saying expected (15) is not equal to actual (18). As mentioned above, I expect 18 headings, not 15.
There's no such limitation; see for yourself: https://github.com/Behat/Gherkin/blob/master/src/Behat/Gherkin/Node/TableNode.php#L92
There must be a mistake on your side. You've given too little detail to judge exactly where.
Are your scenarios still readable with so much detail in them? I'd consider putting only the relevant details into your scenarios and hiding the rest in a context file.
I have the following table structure:
Table name: avail
id (autoincrement) | acc_id | start_date | end_date
---------------------------------------------------
1                  | 175    | 2015-05-26 | 2015-05-31
2                  | 175    | 2015-07-01 | 2015-07-07
It's used for defining date-range availability, i.e. all dates between start_date and end_date are unavailable for the given acc_id.
Based on user input I'm closing different ranges, but I would like to throw an error if a user tries to close (submit) a range whose start_date or end_date falls somewhere within an already existing range for the submitted acc_id.
In this example, a start_date of 2015-05-30 and an end_date of 2015-06-04 would be a good fail candidate.
I've found this QA:
MySQL overlapping dates, none conflicting
which pretty much explains how to do it in two steps: two queries with some PHP logic in between.
But I was wondering if it can be done in one insert statement.
I would eventually check the number of affected rows for success or failure (sub-question: is there a more convenient way to tell whether it failed for some reason other than a date overlap?).
EDIT:
In response to Petr's comment, I'll specify the validation further:
any kind of overlap should be avoided, even one embracing the whole range or sitting entirely inside an existing range. Also, if the start or end date equals an existing start or end date, it must be considered an overlap. Sometimes a given acc_id will already have more than one range in the table, so the validation should be done against all entries with that acc_id.
Sadly, this is practically impossible using just MySQL. The preferred way would be SQL CHECK constraints, which are part of the SQL standard. However, MySQL does not support them.
See: https://dev.mysql.com/doc/refman/5.7/en/create-table.html
The CHECK clause is parsed but ignored by all storage engines.
It seems PostgreSQL does support CHECK constraints on tables, but I'm not sure how viable it is for you to switch database engines, or whether that's even worth the trouble just to use that feature.
In MySQL, a trigger can be used to solve this problem: it would check for overlapping rows before the insert/update occurs and throw an error using the SIGNAL statement (see: https://dev.mysql.com/doc/refman/5.7/en/signal.html). However, to use this solution you'd need MySQL 5.5 or later, where SIGNAL is available.
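A minimal sketch of such a trigger, using the table and column names from the question (the trigger name, SQLSTATE, and message text are arbitrary, and a matching BEFORE UPDATE trigger would be needed as well):

DELIMITER //
CREATE TRIGGER avail_no_overlap
BEFORE INSERT ON avail
FOR EACH ROW
BEGIN
    -- two ranges overlap (inclusively) when each starts before the other ends
    IF EXISTS (SELECT 1 FROM avail
               WHERE acc_id = NEW.acc_id
                 AND NEW.start_date <= end_date
                 AND NEW.end_date >= start_date) THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Date range overlaps an existing one';
    END IF;
END//
DELIMITER ;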
Apart from pure SQL solutions, this is typically done in application logic: whichever program accesses the MySQL database checks for this kind of constraint by requesting every row that would be violated by the new entry in a SELECT COUNT(id) ... statement. If the returned count is larger than 0, it simply doesn't do the insert/update.
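For example, with the same inclusive overlap predicate as in the trigger above (the placeholders stand for the submitted acc_id, start_date, and end_date):

SELECT COUNT(id)
FROM avail
WHERE acc_id = ?       -- submitted acc_id
  AND ? <= end_date    -- submitted start_date
  AND ? >= start_date; -- submitted end_date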
I am starting to think about my new project and I've found a couple of potential speed issues, so I hope you can help me select a good and elegant way to code it.
Each user has database records of "places" he has visited. Each place has "schools", a number of schools in that particular place, and each school has classes. Each class may end its "learning year" at a different time, so its number should be incremented once the date is >= the end of the learning year.
So we have such a database:
"places" table:
place | user_id |
-----------------
1 | 4 |
2 | 4 |
User no. 4 visited places no. 1 and 2.
"schools" table:
school | place |
----------------
5 | 2 |
6 | 2 |
Place 2 has two schools, with ids 5 and 6.
"class" table:
class | school | end_learning | class_number
---------------------------------------------
20 | 5 | 01.01.2013 | 2
21 | 5 | 03.01.2013 | 3
22 | 5 | 05.01.2013 | 4
School 5 has 3 classes, with ids 20, 21, and 22. If the date is later than 01.01.2013, the class number of class 20 should be incremented to 3 and its end_learning date changed to 01.01.2014, and so on.
And now we get to the problem: if there are 1000 places, each with 100 schools, each with 10 classes, we get 1,000,000 records. That's a lot, and what I've presented is just a simplified example. I have to consider updating the whole database every time a user refreshes the page, so I'm afraid it might be laggy with that number of records.
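For illustration, the refresh-time update I'm worried about would boil down to a single set-based statement along these lines (assuming end_learning is a real DATE column):

-- roll every finished class over to the next learning year
UPDATE class
SET class_number = class_number + 1,
    end_learning = DATE_ADD(end_learning, INTERVAL 1 YEAR)
WHERE end_learning <= CURDATE();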
I could also serialize the classes into one field in the schools table:
school | place | classes
-------------------------------------------------------------------------
5 | 2 | serialized class 20, 21, 22 with end_learning field and class number
6 | 2 | other serialized classes from school 6
In that case I get 10 times fewer records, but each time I have to deserialize the data, check the dates, and if a date has passed, alter the data, serialize it, and save it back to the database. The second problem is that I have to select all records from the DB to manipulate them, not just those that need to be altered.
I am also thinking about having two databases: one with records that might need changes in the distant future, and a second with records that might need changes in the next 24 hours (the near future). Every 24 hours, all classes that end their learning year within the next 24 hours are moved to the "near future" DB, so every page refresh works on thousands of records rather than hundreds of thousands or millions; the pass over the millions of "distant future" records to build the "near future" table happens only once per day.
What do you think about all those database schemas? Maybe you have a better idea?
I don't quite understand the business logic or data model you outline - but I will assume you have thought this through.
Firstly, RDBMS solutions like MySQL are really, really good at managing large numbers of records, as long as the data you are working with is relational. As far as I can tell, you will be searching across many records, but only updating a few (a user will only be enrolled in a limited number of classes); I don't see this as a huge problem.
Secondly, it's nearly always better to go with the "standard" relational model until you can prove it doesn't meet your performance needs than to go for "exotic" solutions at the start (I class your serialization and partitioning solutions as "exotic" for the purpose of this answer). A lot of time and energy has gone into optimizing SQL performance; if there were a simple alternative, it would be part of the standard solution. There are, of course, points at which the standard relational model doesn't scale (Facebook-size traffic, for instance), and business domains where the relational model doesn't really fit (documents, graphs). However, all the alternatives have benefits and drawbacks, just like "standard" MySQL.
Thirdly, the best way to deal with possible performance issues is, well, to deal with them. In code. Build a test rig, create a schema according to the relational model, populate it with test data (e.g. using DbMonster), throw some load at it (e.g. using JMeter) and tune your schema and queries to prove your situation doesn't fit the standard solution. Only go for something exotic if you really can prove that you can't play nice with standard, relational database stuff.
I asked this last night and got information on merging (which is unavailable in PostgreSQL). I'm willing to try the suggested workaround, but I'm just trying to understand why it can't be done with conditional logic.
I've clarified the question a bit, so maybe this will be easier to understand.
I have a query that inserts data into a table. But it is creating a new record every time. Is there a way I can check if the row is there first, then if it is, UPDATE, and if it isn't INSERT?
$user = 'username';
$timestamp = date('Y-m-d G:i:s.u');
$check_time = "start"; //can also be stop
$check_type = "start_user"; //can also be stop_user
$insert_query = "INSERT INTO production_order_process_log (
production_order_id, production_order_process_id, $check_time, $check_type)
VALUES (
'$production_order_id', '$production_order_process_id', '$timestamp', '$user')";
The idea is that the table will record check-in and check-out values (production_order_process_log.start and production_order_process_log.stop). So before a record with a check-out timestamp is created, the query should check whether the $production_order_process_id already exists. If it does, the timestamp can go into stop and the $check_type can be stop_user. Otherwise, they stay start and start_user.
I am basically trying to avoid this result.
+----+---------------------+------------------------------+------------------+------------------+------------+-----------+
| id | production_order_id | production_order_process_id  | start            | stop             | start_user | stop_user |
+----+---------------------+------------------------------+------------------+------------------+------------+-----------+
|  8 | 2343                | 1000                         | 12 july 03:23:23 | NULL             | tlh        | NULL      |
|  9 | 2343                | 1000                         | NULL             | 12 july 03:45:00 | NULL       | tlh       |
+----+---------------------+------------------------------+------------------+------------------+------------+-----------+
Many thanks for helping me suss out the PostgreSQL logic to do this task.
This question and answer will be of interest to you: Insert, on duplicate update in PostgreSQL?
Basically, either use two queries (do the SELECT; if a row is found, UPDATE, otherwise INSERT), which is not the best solution (two scripts running simultaneously could produce duplicate inserts), or do as the question above suggests and write a stored procedure/function to do it (probably the best and easiest option).
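A sketch of such a function, adapted to the table from the question and following the retry-loop pattern from the linked answer (the function name and parameter names are made up, and it assumes a unique constraint on production_order_process_id):

CREATE OR REPLACE FUNCTION log_check(p_order_id integer, p_process_id integer,
                                     p_time timestamp, p_user text)
RETURNS void AS
$$
BEGIN
    LOOP
        -- try to check out an existing open row first
        UPDATE production_order_process_log
        SET stop = p_time, stop_user = p_user
        WHERE production_order_process_id = p_process_id
          AND stop IS NULL;
        IF found THEN
            RETURN;
        END IF;

        -- no open row, so try to check in
        BEGIN
            INSERT INTO production_order_process_log
                (production_order_id, production_order_process_id, start, start_user)
            VALUES (p_order_id, p_process_id, p_time, p_user);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL; -- another session inserted concurrently; loop and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;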
Recognizing the nature of your workflow, it seems that an order cannot stop before or at the same time as it starts, right? And it has to have started in order to stop, right? Please correct me if I'm wrong.
So you could just check whether it's a start operation and do an INSERT in that case, or a stop operation and do an UPDATE.
I feel like concurrency doesn't really come into play here.
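In other words, something like this (a minimal sketch, with the values from the example table hard-coded):

-- check-in: create the row
INSERT INTO production_order_process_log
    (production_order_id, production_order_process_id, start, start_user)
VALUES (2343, 1000, now(), 'tlh');

-- check-out: complete the same row instead of inserting a second one
UPDATE production_order_process_log
SET stop = now(), stop_user = 'tlh'
WHERE production_order_process_id = 1000
  AND stop IS NULL;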
Since it looks like recursive queries aren't possible in MySQL, I am wondering if there is a solution that gets the same information while also limiting the number of queries I make to the database. In my case I have what amounts to a tree and, given a node, I make a path back to the root, saving the names of the nodes as I go. Given a table like this:
id | parent
-------------
1 |
2 | 1
3 | 1
4 | 2
5 | 2
6 | 5
I want to select all ids on the path from 6 back to 1 (6, 5, 2, 1). Since the total length of the path is unknown, I would assume the only way to do this is to take the results from one query and build a new query until I am back at the root. Then again, it has been a couple of years since I last used MySQL, so it wouldn't surprise me if I am a little out of touch. Any help would be appreciated.
Since it looks like recursive queries aren't possible in MySQL
MySQL doesn't support the 'CONNECT BY' operator, true, but you can implement recursive stored procedures/functions in MySQL and return result sets from them.
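For instance, a rough sketch of such a procedure, assuming the table is called tree (the question doesn't name it):

DELIMITER //
CREATE PROCEDURE path_to_root(IN start_id INT)
BEGIN
    DECLARE cur INT DEFAULT start_id;
    DECLARE p INT;
    DECLARE path TEXT DEFAULT CAST(start_id AS CHAR);
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET p = NULL;

    walk: LOOP
        SET p = NULL;
        SELECT parent INTO p FROM tree WHERE id = cur;
        IF p IS NULL THEN
            LEAVE walk;        -- reached the root
        END IF;
        SET path = CONCAT(path, ',', p);
        SET cur = p;
    END LOOP;

    SELECT path AS id_path;    -- '6,5,2,1' for start_id = 6
END//
DELIMITER ;

CALL path_to_root(6);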
Site.
This code is not optimized nor is it the best method. If you have any ideas to improve anything, let me know.
Please visit the site to get an idea of the data.
I have used Dan G. Switzer, II's calculation plugin [adding .sum() .max() .min() .avg()]
The validation requirement I'd like to have is to make sure nothing conflicts with that user's already-determined ranges, and that there are no gaps between ranges.
For example, given:
Brian 40 50 1200
Brian 50 70 1200
I don't want the user to be able to set the first 50 to 39, because 39 would be smaller than 40, and I don't want them to set that 50 to anything higher than 50, because it would overlap the next range.
Any good ideas? Perhaps actually running through all the values to build a real range, and then, on blur, checking that no range is overlapped and no gap is left.
Each unique input is defined with id=NAME, so if I want to reference all of Brian's inputs I can use $("input[id='Brian']").each(), and if I want to reference all of Brian's START inputs I can use $("input[id='Brian'][name='start[]']").each().
Edit:
One thing to note is that the page is PHP, and PHP is run to populate the inputs from a CSV file. It will always start with correct data, and PHP can be used to help create the ranges.
Because of this I was thinking of just disabling the START field, since it will always be populated for the next range. However, I will be adding the ability to delete rules, so that can get messy if users are limited in what they can do.
One of the problems I see is that the Name field is editable. If the field is changed, all the overlap checks have to be recalculated. Not only is that a performance issue, it is also a usability problem: what if one wants to change a name and the script, seeing discrepancies, forbids it?
A solution would be to change the table to one resembling the diagram below:
[ Add Employee ]
| Name | Start | End | Wages |
==========================================
| | 0 | 50 | 100 |
| Brian | 50 | 70 | 150 |
| | 100 | 150 | 200 |
| [ Add Rule ] |
------------------------------------------
| Another | 0 | ... | etc |
The [ Add Employee ] button would ask for a name and add a cell in the first column and a row in the next three, with a default Start value of 0. The user can then enter data in the next fields and can [ Add Rule ]s as wanted. A good restriction could be to lock the values of the Start column and set them to the End value of the previous line:
// lock a new line's Start to the previous line's End
$('#Brian .LineN input.start').val(
    $('#Brian .LineN').prev().find('input.end').val()
);
Then adding a comparison function that rejects an End value lower than the Start value is trivial.
Deleting a rule could simply follow the same procedure as adding a line: the line below the deleted one (if any) would set its Start value to the now-previous line's End.
The real difficulty would be to insert lines at arbitrary points. I'm not going to think about that one, though.
EDIT:
If a rewrite is out of the question, then just disabling the Start fields should be enough. However, IMHO, I would rather (re)write something with no caveats than spend time later on numerous bugs and feature requests.