Data migration into CiviCRM - keep legacy IDs

I develop custom migration code using CiviCRM's PHP API calls like:
<?php
$result = civicrm_api3('Contact', 'create', array(
    'sequential' => 1,
    'contact_type' => "Household",
    'nick_name' => "boo",
    'first_name' => "moo",
));
I need to keep the original IDs, but specifying 'id' or 'contact_id' above does not work: it either creates no contact or updates an existing one.
The ID column is auto-incremented, but MySQL still allows inserting an explicit, unique value into such a column.
How would you proceed? Hack CiviCRM so that it passes the id to MySQL in the INSERT statement? Dump the SQL after the import and rewrite the IDs in the .sql text file (where integrity is hard to maintain)? Any other suggestions?
I have at least ~300,000 entries to deal with, so a fully automated and robust solution is a must. Is there any SQL magic that could do this?
For those who are not familiar with CiviCRM, the table structure is the following:
mysql> desc civicrm_contact;
+--------------------------------+------------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+------------------+------+-----+-------------------+-----------------------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| contact_type | varchar(64) | YES | MUL | NULL | |
| contact_sub_type | varchar(255) | YES | MUL | NULL | |
| do_not_email | tinyint(4) | YES | | 0 | |
| do_not_phone | tinyint(4) | YES | | 0 | |
| do_not_mail | tinyint(4) | YES | | 0 | |
| do_not_sms | tinyint(4) | YES | | 0 | |
| do_not_trade | tinyint(4) | YES | | 0 | |
| is_opt_out | tinyint(4) | NO | | 0 | |
| legal_identifier | varchar(32) | YES | | NULL | |
| external_identifier | varchar(64) | YES | UNI | NULL | |
The column in question is the first one, id.

You should use the external_identifier field, which exists for exactly this purpose.
CiviCRM itself does not use this field, so there is no risk of interfering with core functionality. It is meant for linking to an external system (a legacy one, for example).
CiviCRM treats external_identifier as unique, so if you try to insert a contact with an existing external_identifier it will throw an error (via the API, I think) or update the contact (via the CiviCRM contact import screen).
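A minimal sketch of how the migration call from the question might carry the legacy ID in external_identifier. The helper name buildContactParams and the legacy row keys (legacy_id, nick_name, first_name) are hypothetical, and the API call only runs where CiviCRM is actually loaded:

```php
<?php
/**
 * Map one legacy row to Contact.create parameters.
 * The function name and the legacy row keys are illustrative assumptions.
 */
function buildContactParams(array $legacyRow): array
{
    return [
        'sequential'          => 1,
        'contact_type'        => 'Household',
        'nick_name'           => $legacyRow['nick_name'],
        'first_name'          => $legacyRow['first_name'],
        // The legacy primary key goes here instead of into civicrm_contact.id.
        'external_identifier' => (string) $legacyRow['legacy_id'],
    ];
}

// Made-up legacy row for illustration.
$params = buildContactParams([
    'legacy_id'  => 4711,
    'nick_name'  => 'boo',
    'first_name' => 'moo',
]);

// Call the API only when CiviCRM is available; otherwise just show the parameters.
if (function_exists('civicrm_api3')) {
    $result = civicrm_api3('Contact', 'create', $params);
} else {
    print_r($params);
}
```

After the import, the legacy ID should then be usable as a lookup key, for example by filtering Contact.get on external_identifier, while CiviCRM keeps assigning its own auto-incremented id values.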

Related

Reports in Laravel 5.2

I want to generate a report from a table, like
+-------------+------------------+------+-----+------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+------------+----------------+
| productID | int(10) unsigned | NO | PRI | NULL | auto_increment |
| productCode | char(3) | NO | | | |
| name | varchar(30) | NO | | | |
| quantity | int(10) unsigned | NO | | 0 | |
| price | decimal(7,2) | NO | | 99999.99 | |
+-------------+------------------+------+-----+------------+----------------+
and show the top sellers in some kind of chart. I'm lost on this subject.
Is there a package that generates such reports?
Thanks in advance for any info.

I don't think there is a package that generates the reports for you. Reporting comes down to fetching data from the DB, analyzing it, and sending the output to the client/browser. I would suggest fetching the data from the DB and sending it to the client as JSON. On the client side, you can use a charting library such as Highcharts or D3.js to plot the graph.
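As a rough sketch of the server-side half of that suggestion, the function below picks the top rows and encodes them as JSON. The function name topSellersJson is hypothetical, the sample rows are made up, and treating quantity as "units sold" is an assumption, since the question does not say which column measures sales:

```php
<?php
/**
 * Return the $limit products with the highest quantity as a JSON string.
 * Assumption: `quantity` stands for units sold.
 */
function topSellersJson(array $products, int $limit = 3): string
{
    // Sort descending by quantity.
    usort($products, function (array $a, array $b): int {
        return $b['quantity'] <=> $a['quantity'];
    });

    $top = array_slice($products, 0, $limit);

    // Keep only what a chart needs: a label and a value.
    $series = array_map(function (array $p): array {
        return ['name' => $p['name'], 'quantity' => (int) $p['quantity']];
    }, $top);

    return json_encode($series);
}

// Sample rows standing in for a database result (made-up data).
$rows = [
    ['productID' => 1, 'productCode' => 'PEN', 'name' => 'Pen Red',   'quantity' => 5000,  'price' => '1.23'],
    ['productID' => 2, 'productCode' => 'PEN', 'name' => 'Pen Blue',  'quantity' => 8000,  'price' => '1.25'],
    ['productID' => 3, 'productCode' => 'PEC', 'name' => 'Pencil 2B', 'quantity' => 10000, 'price' => '0.48'],
];

echo topSellersJson($rows, 2), PHP_EOL;
```

In a Laravel application the rows would come from a query and the array could be returned from a controller as a JSON response; the charting library on the client then only has to read name and quantity.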

How can a simple MySQL insert/update be slower than an external web request?

Due to some performance issues I've been optimizing several SQL queries and adding indexes to certain tables/columns to speed things up.
I've been running some time tests using microtime() in PHP (looping the queries a couple hundred times and calling RESET QUERY CACHE in each loop). I'm somewhat baffled by the results from one of the functions, which does 3 things:
Inserts a row in a sessions table (InnoDB).
Updates a row in a users table (InnoDB).
Sends the session ID to a remote server, which inserts the session ID in a session table of its own (MongoDB).
Step 1 generally takes 30 - 40 ms, step 2 takes 20 - 30 ms and step 3 takes 7 - 20 ms.
I've tried looking up some expected query times for MySQL, but haven't found anything useful, so I don't know what to expect. Having said that, those query times seem somewhat high, and I would definitely not expect the web request to finish faster than the MySQL queries to the local database.
Any idea if those query times are reasonable compared to the web request?
SQL/system information
Both servers (the remote one and the one with the MySQL database) are virtual servers running on the same physical server with shared storage (a multiple-SSD RAID setup). The remote server has a single CPU and 2 GB RAM assigned; the MySQL server has 8 CPUs and 32 GB RAM assigned. Both servers are on the same LAN.
The sessions insert query:
INSERT INTO sessions (
    session_id,
    user_id,
    application,
    machine_id,
    user_agent,
    ip,
    method,
    created,
    last_active,
    expires
)
VALUES (
    ?,                  -- session_id (string)
    ?,                  -- user_id (int)
    ?,                  -- application (string)
    ?,                  -- machine_id (string)
    ?,                  -- user_agent (string)
    ?,                  -- ip (string)
    ?,                  -- method (string)
    CURRENT_TIMESTAMP,  -- created
    CURRENT_TIMESTAMP,  -- last_active
    FROM_UNIXTIME(?)    -- expires (PHP timestamp, or NULL for no expiry)
);
The sessions table (contains ~500'000 rows):
+-------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+----------------+
| sessions_id | int(11) | NO | PRI | NULL | auto_increment |
| session_id | char(32) | NO | UNI | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
| application | varchar(128) | NO | | NULL | |
| machine_id | varchar(36) | NO | | NULL | |
| user_agent | varchar(1024) | NO | | NULL | |
| ip | varchar(15) | NO | | NULL | |
| method | varchar(20) | NO | | NULL | |
| created | datetime | NO | | NULL | |
| last_active | datetime | NO | | NULL | |
| expires | datetime | YES | MUL | NULL | |
+-------------+---------------+------+-----+---------+----------------+
The users update query:
UPDATE users
SET last_active = ?  -- string, for example '2016-01-01 00:00:00'
WHERE user_id = ?;   -- int
The users table (contains ~200'000 rows):
+------------------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+---------------------+------+-----+---------+----------------+
| user_id | int(11) | NO | PRI | NULL | auto_increment |
| username | varchar(64) | NO | MUL | NULL | |
| first_name | varchar(256) | NO | | NULL | |
| last_name | varchar(256) | NO | | NULL | |
| info | varchar(512) | NO | | NULL | |
| address1 | varchar(512) | NO | | NULL | |
| address2 | varchar(512) | NO | | NULL | |
| city | varchar(256) | NO | | NULL | |
| zip_code | varchar(128) | NO | | NULL | |
| state | varchar(256) | NO | | NULL | |
| country | varchar(128) | NO | | NULL | |
| locale | varchar(5) | NO | | NULL | |
| phone | varchar(128) | NO | | NULL | |
| email | varchar(256) | NO | MUL | NULL | |
| password | char(60) | NO | MUL | NULL | |
| permissions | bigint(20) unsigned | NO | | 0 | |
| created | datetime | YES | | NULL | |
| last_active | datetime | YES | | NULL | |
+------------------------+---------------------+------+-----+---------+----------------+

It seems that the problem was simply our MySQL settings (they were all default).
I profiled the users update query in MySQL and found that the "query end" step accounted for most of the execution time.
Googling that led me to https://stackoverflow.com/a/12446986/736247. Rather than applying all the suggested values directly (which I cannot recommend, because some of them can harm data integrity), I looked for more information and found, among other things, this page on Percona: https://www.percona.com/blog/2013/09/20/innodb-performance-optimization-basics-updated/.
The MySQL reference page "InnoDB Startup Options and System Variables" was also useful: http://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html.
I ended up setting new values for the following settings:
innodb_flush_log_at_trx_commit
innodb_flush_method
innodb_buffer_pool_size
innodb_buffer_pool_instances
innodb_log_file_size
This resulted in significantly shorter query times (measured in the same way as I did in the question):
Insert a row in a sessions table: ~8 ms (down from 30-40 ms).
Update a row in a users table: ~2.5 ms (down from 20-30 ms).
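For reference, the measurement approach described in the question (looping an operation and timing it with microtime()) could look roughly like the sketch below. The helper name averageMilliseconds is hypothetical, and the usleep() call is only a stand-in for executing the real query:

```php
<?php
/**
 * Run $operation $iterations times and return the average duration in ms.
 * The helper name and signature are illustrative, not from the original post.
 */
function averageMilliseconds(callable $operation, int $iterations = 200): float
{
    $total = 0.0;
    for ($i = 0; $i < $iterations; $i++) {
        $start = microtime(true);           // current time in seconds, as a float
        $operation();
        $total += microtime(true) - $start; // accumulate elapsed seconds
    }
    return ($total / $iterations) * 1000.0;
}

// Stand-in workload; in the real test this closure would run the INSERT or UPDATE.
$avg = averageMilliseconds(function (): void {
    usleep(1000); // sleep for roughly one millisecond
}, 20);

printf("average: %.2f ms\n", $avg);
```

Timing each statement in its own closure keeps the per-step numbers separate, which is what made the comparison between the insert, the update, and the web request possible.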

Laravel 5 custom database sessions?

I'm new to Laravel (using 5.1). I have my entire DB schema (MySQL 5.5) diagrammed and have begun implementing it. The problem is, I need to adapt Laravel to use my sessions table. After making a new migration to bring the table more in line with what Laravel expects, I have this table:
+---------------+---------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------------------------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| id_hash | varchar(255) | NO | UNI | NULL | |
| user_id | bigint(20) unsigned | NO | | 0 | |
| created_at | int(10) unsigned | NO | | 0 | |
| updated_at | int(10) unsigned | NO | | 0 | |
| expires_at | int(10) unsigned | NO | | 0 | |
| last_activity | int(10) unsigned | NO | | 0 | |
| platform | enum('d','p','t','b','a','i','w','k') | NO | | d | |
| ip_address | varchar(40) | NO | | 0.0.0.0 | |
| payload | text | NO | | NULL | |
| user_agent | text | NO | | NULL | |
+---------------+---------------------------------------+------+-----+---------+----------------+
The main thing I need to accomplish is to have id as an auto-incrementing integer (because my Session model has relationships to other models) and to use id_hash as the publicly identifying string, which I think is the token in the payload (I also plan to cut id_hash back to 64 characters).
At session creation, id_hash, platform, ip_address, and user_agent will be set, never to change again. After authentication, user_id will be populated, then cleared at logout.
I'm ok with keeping the payload handling as-is.
Is this just a matter of creating a custom class that implements SessionHandlerInterface? What else needs to be in it for handling my extra fields that's not obvious from the session docs?
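To make the question concrete, here is a rough sketch of what such a class might look like. It is not a confirmed answer: the class name HashKeyedSessionHandler and the integerId() helper are hypothetical, an in-memory array stands in for the sessions table, and the method signatures assume PHP 8.0 or later:

```php
<?php
/**
 * Sketch of a session handler keyed by the public hash (id_hash) while each
 * row keeps its own auto-incrementing integer id. An in-memory array stands
 * in for the sessions table. Requires PHP 8.0+.
 */
final class HashKeyedSessionHandler implements SessionHandlerInterface
{
    /** @var array<string, array> rows indexed by id_hash */
    private array $rows = [];
    private int $nextId = 1;

    public function __construct(
        private string $platform,
        private string $ipAddress,
        private string $userAgent
    ) {
    }

    public function open(string $path, string $name): bool
    {
        return true;
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        // An unknown session reads as an empty payload.
        return $this->rows[$id]['payload'] ?? '';
    }

    public function write(string $id, string $data): bool
    {
        $now = time();
        if (!isset($this->rows[$id])) {
            // These fields are set once at creation and never changed afterwards.
            $this->rows[$id] = [
                'id'         => $this->nextId++,
                'id_hash'    => $id,
                'platform'   => $this->platform,
                'ip_address' => $this->ipAddress,
                'user_agent' => $this->userAgent,
                'created_at' => $now,
            ];
        }
        $this->rows[$id]['payload'] = $data;
        $this->rows[$id]['last_activity'] = $now;
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    public function gc(int $max_lifetime): int|false
    {
        $cutoff = time() - $max_lifetime;
        $before = count($this->rows);
        $this->rows = array_filter(
            $this->rows,
            fn(array $row): bool => $row['last_activity'] >= $cutoff
        );
        return $before - count($this->rows); // number of sessions removed
    }

    /** Demo helper: the integer id behind a public hash, or null if unknown. */
    public function integerId(string $idHash): ?int
    {
        return $this->rows[$idHash]['id'] ?? null;
    }
}

$handler = new HashKeyedSessionHandler('d', '127.0.0.1', 'ExampleAgent/1.0');
$handler->write('abc123', 'payload-1');
$handler->write('abc123', 'payload-2'); // same hash: the payload is updated, the id is kept
echo $handler->read('abc123'), PHP_EOL;
```

The extra columns are handled in write(), because that is the only point at which the handler learns that a session is new. In Laravel such a handler would presumably be registered as a custom session driver (for example through Session::extend in a service provider), with the array replaced by queries against the sessions table.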

Can't solve child parent association in cakephp 3

I have table with a parent child relationship with itself:
mysql> desc features;
+---------------------+-------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+-------------+------+-----+-------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| featureID | varchar(45) | NO | | NULL | |
| probeID | int(11) | YES | | NULL | |
| shortName | varchar(45) | NO | | NA | |
| start | int(11) | NO | | NULL | |
| stop | int(11) | NO | | NULL | |
| strand | int(11) | NO | | NULL | |
| curatedManually | varchar(45) | NO | | NA | |
| created | timestamp | NO | | CURRENT_TIMESTAMP | |
| descriptions_id | int(11) | YES | MUL | NULL | |
| features_types_id | int(11) | NO | MUL | NULL | |
| chromosomes_id | int(11) | NO | MUL | NULL | |
| species_id | int(11) | NO | MUL | NULL | |
| strains_id | int(11) | NO | MUL | NULL | |
| parents_features_id | int(11) | YES | MUL | NULL | |
+---------------------+-------------+------+-----+-------------------+----------------+
The relevant fields are parents_features_id and id, because a feature can have "child" features or "parent" features. A foreign key relationship is established between these fields:
KEY `fk_features_Features1_idx` (`parents_features_id`),
CONSTRAINT `fk_features_Features1` FOREIGN KEY (`parents_features_id`) REFERENCES `features` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
I used "cake bake all features" to create all necessary models, controllers etc.
When I open the features page I only get the error message "Features is not associated with ParentsFeatures"
I tried to solve it by replacing the original baked code
$this->belongsTo('Features', [
    'foreignKey' => 'parents_features_id'
]);
in Model/Table/FeaturesTable.php with the following code for this relationship:
$this->belongsTo('ParentsFeatures', [
    'className' => 'Features',
    'foreignKey' => 'parents_features_id'
]);
$this->hasMany('ChildFeatures', [
    'className' => 'Features',
    'foreignKey' => 'parents_features_id'
]);
But then I get the error message "Features is not associated with Features".
I am a little bit stuck here and would really appreciate any help to solve this.
all the best
Nadine

Changing the associations in your table class will not change how the baked controllers query the data; you'll have to change them too, or re-bake your code.
However, this will never bake correctly unless you follow the conventions, that is, name the foreign key column parent_id. Only then will bake be able to create the proper associations (which will be named ParentFeatures and ChildFeatures). While you're at it, consider changing the other column names too (lowercase, underscored).
That being said, there might be a bug that occurs somewhere between renaming indexes, creating and deleting foreign key constraints, etc. (I can't pin it down), which then causes wrong association names to be used for contain in the find() calls of the controller actions, i.e. bake will generate something like
'contain' => ['Features']
instead of
'contain' => ['ParentFeatures']
which is the name it used for the association in the baked table class.
However, I'm not able to reproduce it reliably right now. If this is what you are experiencing, you might want to report it as an issue over at GitHub.
If re-baking doesn't fix it, manually check the find() calls in your controller actions and change the contained association to ParentFeatures.

Mysql trigger or coding in PHP?

I have a table hardware_description:
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| Computer_id | int(11) | NO | PRI | NULL | auto_increment |
| Emp_id | int(11) | NO | MUL | NULL | |
| PC_type | varchar(20) | YES | | NULL | |
| Operating_system | varchar(20) | YES | | NULL | |
| Product_key | varchar(30) | YES | | NULL | |
| Assign_date | date | YES | | NULL | |
| DVD_ROM | varchar(20) | YES | | NULL | |
| CPU | varchar(30) | YES | | NULL | |
| IP_address | varchar(30) | YES | | NULL | |
| MAC_address | varchar(30) | YES | | NULL | |
| Model_name | varchar(30) | YES | | NULL | |
| Model_number | varchar(30) | YES | | NULL | |
| Monitor | varchar(30) | YES | | NULL | |
| Processor | varchar(30) | YES | | NULL | |
| Product_name | varchar(30) | YES | | NULL | |
| RAM | varchar(20) | YES | | NULL | |
| Serial_number | varchar(30) | YES | | NULL | |
| Vendor_id | varchar(30) | YES | | NULL | |
Emp_id is a foreign key from the employees table.
When I update a particular row, I want the existing data for that row to be saved in another table along with the timestamp of the update. Now,
a) Shall I use PHP code (a PDO transaction) to first grab that row and insert it into another table, then perform the UPDATE query on that row?
b) Or use a trigger on this table?
Which approach is better practice and more efficient? Is there another way of achieving this?
I have not used triggers in my short career so far, but I can if that is the better practice.

If you can use a trigger, that would be a much better option.
The reason is that if you ever forget to write the PHP code that does this (in some unusual code path), the update would go through without a matching history row, leaving gaps in the saved data. A trigger runs on every UPDATE, regardless of which code issued it.
Here's the link to the MySQL documentation page for triggers: http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
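As an illustration of the trigger approach, the PHP helper below builds a BEFORE UPDATE trigger statement that copies the old row into a history table. The helper name buildHistoryTriggerSql, the history table hardware_description_history, and its changed_at column are all assumptions; the history table would have to be created with matching columns first, and the identifiers are assumed to be trusted, since they are not quoted or escaped:

```php
<?php
/**
 * Build a MySQL trigger that copies the pre-update row into a history table.
 * Table and column names are inserted as-is, so they must be trusted input.
 */
function buildHistoryTriggerSql(string $table, string $historyTable, array $columns): string
{
    $columnList = implode(', ', $columns);
    // OLD.<column> refers to the row's values before the UPDATE is applied.
    $oldValues = implode(', ', array_map(fn(string $c): string => "OLD.$c", $columns));

    return "CREATE TRIGGER {$table}_before_update\n"
         . "BEFORE UPDATE ON {$table}\n"
         . "FOR EACH ROW\n"
         . "INSERT INTO {$historyTable} ({$columnList}, changed_at)\n"
         . "VALUES ({$oldValues}, NOW());";
}

// Only a few of the table's columns are listed to keep the example short.
echo buildHistoryTriggerSql(
    'hardware_description',
    'hardware_description_history',
    ['Computer_id', 'Emp_id', 'PC_type', 'IP_address']
), PHP_EOL;
```

Because the trigger body is a single INSERT statement, it needs no BEGIN ... END block. The generated statement would be run once against the database, after which every UPDATE on hardware_description records the previous values together with a timestamp.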
