Import massive csv data with symfony command too slow with doctrine - php

I need to import a lot of data from a CSV file (45 MB) into a MySQL database with Symfony. I am using the League\Csv\Reader library
and I wrote a command that uses Doctrine.
It works, but it is very slow.
How can I speed it up?
I tried:
adding $this->em->clear() after $this->em->flush();
adding the following to disable SQL logging and avoid a huge memory loss:
$this->em->getConnection()->getConfiguration()->setSQLLogger(null);
namespace App\Command;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Console\Style\SymfonyStyle;
use App\Entity\Developer;
use App\Entity\BadgeLabel;
use Doctrine\ORM\EntityManagerInterface;
use League\Csv\Reader;
class CsvImportCommand extends Command
{
public function __construct(EntityManagerInterface $em){
parent::__construct();
$this->em = $em;
}
// the name of the command (the part after "bin/console")
protected static $defaultName = 'app:import-developpers';
protected function configure()
{
$this
// the short description shown while running "php bin/console list"
->setDescription('Import a new developper.')
// the full command description shown when running the command with
// the "--help" option
->setHelp('This command allows you to import a develpper...')
;
}
protected function execute(InputInterface $input, OutputInterface $output)
{
$io = new SymfonyStyle($input, $output);
$io->title('Importation en cours');
$reader = Reader::createFromPath('%kernel.root_dir%/../src/Data/developers_big.csv')
->setHeaderOffset(0)
;
$results = $reader->getrecords();
$io->progressStart(iterator_count($results));
//Disable SQL Logging: to avoid huge memory loss.
$this->em->getConnection()->getConfiguration()->setSQLLogger(null);
foreach ($results as $row) {
$developer = $this->em->getRepository(Developer::class)
->findOneBy([
'firstName' => ($row['FIRSTNAME']),
'lastName'=> ($row['LASTNAME'])
])
;
if (null === $developer) {
$developer = new Developer();
$developer
->setFirstName($row['FIRSTNAME'])
->setLastName($row['LASTNAME']);
$this->em->persist($developer);
$this->em->flush();
$this->em->clear();
}
$badgeLabel = $this->em->getRepository(BadgeLabel::class)
->findOneBy([
'name' => ($row['BADGE LABEL']),
'level'=> ($row['BADGE LEVEL'])
])
;
if (null === $badgeLabel) {
$badgeLabel = new BadgeLabel;
$badgeLabel
->setName($row['BADGE LABEL'])
->setLevel($row['BADGE LEVEL']);
$this->em->persist($badgeLabel);
$this->em->flush();
$this->em->clear();
}
$developer
->addBadgeLabel($badgeLabel);
$io->progressAdvance();
}
$this->em->flush();
$this->em->clear();
$io->progressFinish();
$io->success('Importation terminée avec succès');
}
}
The command works, but it is too slow. After 15 minutes, only 32% had been uploaded to my MySQL database. I expected it to take 2 minutes at most.

Method 1 (not the best)
Every time flush() is called, Doctrine recomputes the change sets of the managed entities and goes through all of its listeners, so you should avoid flushing on each iteration. You can replace each flush with this code ($i is a row counter initialised to 0 before the loop):
if (0 === (++$i % (int) $input->getOption('fetch'))) {
    $this->em->flush();
    $this->em->clear();
}
The fetch option can be declared in the configure method (this requires use Symfony\Component\Console\Input\InputOption;):
const BATCH_SIZE = 1000; // As an example
/**
 * Configure the command.
 */
protected function configure()
{
    $this
        // the short description shown while running "php bin/console list"
        ->setDescription('Import a new developper.')
        // This option helps you find a good value; the BATCH_SIZE constant is the default
        ->addOption('fetch', 'f', InputOption::VALUE_REQUIRED, 'Number of loop iterations between each flush', self::BATCH_SIZE)
        // the full command description shown when running the command with
        // the "--help" option
        ->setHelp('This command allows you to import a develpper...')
    ;
}
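The batching idea can be isolated from Doctrine to make the control flow easier to see. The following is a minimal sketch, not the answerer's code: processInBatches() and its two callbacks are hypothetical names. It calls a flush callback once every N rows and once more at the end for a trailing partial batch.

```php
<?php

/**
 * Calls $handleRow for every row and $flush once per $batchSize rows,
 * plus one final time for a trailing partial batch.
 * Returns how many times $flush was called.
 */
function processInBatches(iterable $rows, int $batchSize, callable $handleRow, callable $flush): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $pending = 0; // rows handled since the last flush
    $flushes = 0;

    foreach ($rows as $row) {
        $handleRow($row);

        if (++$pending === $batchSize) {
            $flush();
            $flushes++;
            $pending = 0;
        }
    }

    // Flush whatever is left over from the last, incomplete batch.
    if ($pending > 0) {
        $flush();
        $flushes++;
    }

    return $flushes;
}
```

In the import command, $handleRow would persist the entities built from one CSV row, and $flush would call $this->em->flush() followed by $this->em->clear(). Keep in mind that findOneBy() queries the database, so entities that were persisted but not yet flushed will not be found; a batched import may need its own in-memory index of the keys it has already seen to avoid duplicates.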
Method 2: more efficient
You can create a command that writes all the INSERT or UPDATE queries to a SQL file. Then you run a native command (for example the mysql client) that reads the file and executes the queries.
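As a rough illustration of the SQL-file approach, the sketch below writes multi-row INSERT statements to a file with fwrite. Everything named here is an assumption made for the example: the function writeInsertFile(), the table and column names passed to it, and the $quote callback, which must return a safely quoted SQL literal (in a real command this could wrap PDO::quote()).

```php
<?php

/**
 * Writes one multi-row INSERT statement per chunk of rows to $path.
 * $quote must turn a scalar into a safely quoted SQL literal.
 * Returns the number of statements written.
 */
function writeInsertFile(string $path, string $table, array $columns, iterable $rows, callable $quote, int $rowsPerStatement = 500): int
{
    $handle = fopen($path, 'wb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing.");
    }

    $prefix = sprintf('INSERT INTO %s (%s) VALUES', $table, implode(', ', $columns));
    $tuples = [];
    $statements = 0;

    // Writes the buffered tuples as one statement and empties the buffer.
    $writeChunk = function () use (&$tuples, &$statements, $handle, $prefix): void {
        if ($tuples === []) {
            return;
        }
        fwrite($handle, $prefix . "\n" . implode(",\n", $tuples) . ";\n");
        $tuples = [];
        $statements++;
    };

    foreach ($rows as $row) {
        $values = [];
        foreach ($columns as $column) {
            $values[] = $quote($row[$column]);
        }
        $tuples[] = '(' . implode(', ', $values) . ')';

        if (count($tuples) === $rowsPerStatement) {
            $writeChunk();
        }
    }

    $writeChunk(); // trailing partial chunk
    fclose($handle);

    return $statements;
}
```

The resulting file could then be loaded with the mysql command-line client, for example mysql -u user -p database < import.sql, where the user, database and file names are placeholders.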
Method 3: using DBAL
As suggested in the comments, you could use DBAL to avoid unnecessary object hydration with Doctrine.
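One way the DBAL route might look is to send several rows per statement with bound parameters instead of building entities. The helper below is a hypothetical sketch (buildBulkInsert() and the table and column names used with it are assumptions); it only builds the SQL string and the flat parameter list, so it does not depend on any library.

```php
<?php

/**
 * Builds a single INSERT with "?" placeholders for several rows.
 * Returns [$sql, $params], where $params is the flat list of values
 * in the same order as the placeholders.
 */
function buildBulkInsert(string $table, array $columns, array $rows): array
{
    if ($columns === [] || $rows === []) {
        throw new InvalidArgumentException('Need at least one column and one row.');
    }

    // One "(?, ?, ...)" group per row.
    $tuple = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $params = [];

    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }

    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $tuple))
    );

    return [$sql, $params];
}
```

With Doctrine DBAL, the pair could be passed to $connection->executeStatement($sql, $params) (executeUpdate() in older DBAL 2 releases), ideally for chunks of a few hundred rows inside a transaction. Checking for existing developers and badge labels would still have to be handled separately, for example with unique indexes.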

Related

Create Artisan command with option that must specify a value

Laravel's documentation says (emphasis mine):
If the user must specify a value for an option, you should suffix the option name with a = sign…
But then goes on to say:
If the option is not specified when invoking the command, its value will be null…
Which suggests that "must" doesn't mean what I think it means. And indeed that is the case. A simple command with a signature like this:
protected $signature = "mycommand {--t|test=}";
Will run just fine when called like artisan mycommand -t. And what's worse is that if you specify a default value, it isn't applied in this case.
protected $signature = "mycommand {--t|test=42}";
When running artisan mycommand, $this->option('test') will give you a value of 42, but when run as artisan mycommand -t it gives a value of null.
So, is there a way to require that a user must (actually) specify a value for a given option, if it's present on the command line?
Poking around the Laravel code, I confirmed that there is no way to have a truly "required" value. Although Symfony does provide for required values, Laravel doesn't use this capability. Instead the options are all created as optional, so I will have to write my own parser...
This was fairly straightforward; I had to write a custom parser class to override the Illuminate\Console\Parser::parseOption() method, and then override Illuminate\Console\Command::configureUsingFluentDefinition() to use that new class.
I elected to create a new option type, rather than change the behaviour of any existing command options. So now I declare my signature like this when I want to force a value:
<?php
namespace App\Console\Commands;
use App\Console\Command;
class MyCommand extends Command
{
/** @var string The double == means a required value */
protected $signature = "mycommand {--t|test==}";
...
}
Attempting to run artisan mycommand -t will now throw a Symfony\Component\Console\Exception\RuntimeException with a message of "The --test option requires a value." This also works for array options (--t==*) and/or options with default values (--t==42 or --t==*42.)
Here's the code for the new parser class:
<?php
namespace App\Console;
use Illuminate\Console\Parser as BaseParser;
use Symfony\Component\Console\Input\InputOption;
class Parser extends BaseParser
{
protected static function parseOption($token): InputOption
{
[$mytoken, $description] = static::extractDescription($token);
$matches = preg_split("/\\s*\\|\\s*/", $mytoken, 2);
if (isset($matches[1])) {
$shortcut = $matches[0];
$mytoken = $matches[1];
} else {
$shortcut = null;
}
switch (true) {
case str_ends_with($mytoken, "=="):
return new InputOption(
trim($mytoken, "="),
$shortcut,
InputOption::VALUE_REQUIRED,
$description
);
case str_ends_with($mytoken, "==*"):
return new InputOption(
trim($mytoken, "=*"),
$shortcut,
InputOption::VALUE_REQUIRED | InputOption::VALUE_IS_ARRAY,
$description
);
case preg_match("/(.+)==\*(.+)/", $mytoken, $matches):
return new InputOption(
$matches[1],
$shortcut,
InputOption::VALUE_REQUIRED | InputOption::VALUE_IS_ARRAY,
$description,
preg_split('/,\s?/', $matches[2])
);
case preg_match("/(.+)==(.+)/", $mytoken, $matches):
return new InputOption(
$matches[1],
$shortcut,
InputOption::VALUE_REQUIRED,
$description,
$matches[2]
);
default:
// no == here, fall back to the standard parser
return parent::parseOption($token);
}
}
}
And the new command class:
<?php
namespace App\Console;
use Illuminate\Console\Command as BaseCommand;
use ReflectionMethod;
class Command extends BaseCommand
{
/**
* Overriding the Laravel parser so we can have required arguments
*
* @inheritdoc
* @throws \ReflectionException
*/
protected function configureUsingFluentDefinition(): void
{
// using our parser here
[$name, $arguments, $options] = Parser::parse($this->signature);
// need to call the great-grandparent constructor here; probably
// could have hard-coded to Symfony, but better safe than sorry
$reflectionMethod = new ReflectionMethod(
get_parent_class(BaseCommand::class),
"__construct"
);
$reflectionMethod->invoke($this, $name);
$this->getDefinition()->addArguments($arguments);
$this->getDefinition()->addOptions($options);
}
}

PHP/Symfony 2 - Fetch data from db, edit it and resend to another server

In my symfony 2 app by using a console command I want to fetch data from database table. Looping through I want to change them a bit (e.g multiply some values) and then send it to database on another server (by curl).
How can I set new names of the columns and assign those data to them? And also how to send it as an .sql file?
Just to give you brief notion here is a raw version of my command:
class MyCommand extends ContainerAwareCommand
{
private $input;
private $output;
protected function configure()
{
// my configs
}
protected function execute(InputInterface $input, OutputInterface $output)
{
$entityManager = $this->getContainer()->get('doctrine')->getManager('default');
$queryBuilder = $entityManager->createQueryBuilder();
$query = $queryBuilder
-> // my query params
foreach ($query->iterate() as $e)
{
// loop through results of the query in order to create a new DB table
}
$this->sendData($output);
}
}
Thank you in advance for any help and advice!
You can create another entity manager which connects to a different database. You can see that here, it's a fairly simple yaml config.
Just bring both entity managers (your 'default' and your 'other') then take what data you need and persist / flush to your other db.
Like so:
class MyCommand extends ContainerAwareCommand
{
private $input;
private $output;
protected function configure()
{
// my configs
}
protected function execute(InputInterface $input, OutputInterface $output)
{
$entityManager = $this->getContainer()->get('doctrine')->getManager('default');
$otherEntityManager = $this->getContainer()->get('doctrine')->getManager('other');
$queryBuilder = $entityManager->createQueryBuilder();
$query = $queryBuilder
-> // my query params
foreach ($query->iterate() as $e)
{
// loop through results
$otherEntity = (new OtherEntity())
->setFirstProperty($first)
->setSecondProperty($second)
;
$otherEntityManager->persist($otherEntity);
}
$otherEntityManager->flush();
}
}
Generating a SQL file and sending it by curl to another server that then executes it doesn't seem like the best approach at all. What if someone else sends a malicious SQL file to be executed on the server?
Doctrine provides a way to connect to multiple databases. Maybe you could use this functionality instead of sending the SQL file.
To answer your question: inside your loop you could generate the INSERT/UPDATE statements you need and save each one to a text file using a function like fwrite.
See the documentation for this function here: http://php.net/manual/en/function.fwrite.php
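For the "new column names" and "multiply some values" parts of the question, a small mapping function may be enough before the data is persisted or written out. This is only a sketch under assumptions: remapRow() and the column names in the usage example are hypothetical, and rows are treated as plain associative arrays rather than entities.

```php
<?php

/**
 * Returns a new row keyed by the target column names.
 * $columnMap:   source column => target column
 * $multipliers: source column => factor applied to numeric values
 */
function remapRow(array $row, array $columnMap, array $multipliers = []): array
{
    $result = [];

    foreach ($columnMap as $source => $target) {
        if (!array_key_exists($source, $row)) {
            throw new InvalidArgumentException("Missing source column: $source");
        }

        $value = $row[$source];

        // Only scale values that are numeric and have a factor configured.
        if (isset($multipliers[$source]) && is_numeric($value)) {
            $value *= $multipliers[$source];
        }

        $result[$target] = $value;
    }

    return $result;
}
```

For example, remapRow(['price' => 10, 'name' => 'Widget'], ['price' => 'price_cents', 'name' => 'label'], ['price' => 100]) yields a row with price_cents set to 1000 and label set to 'Widget'. The remapped row can then feed the setters of the other entity or the generated INSERT statements.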

How to test Doctrine Migrations?

I'm working on a project that does NOT have a copy of the production DB in the development environment.
Sometimes we have an issue with DB migrations: they pass on the dev DB but fail in production/testing.
It's often because the dev environment data is loaded from fixtures that use the latest entities, filling all tables properly.
Is there any easy way to make sure Doctrine migration(s) will pass in production?
Do you know any way to write automatic tests that will make sure data will be migrated properly, without downloading the production/testing DB and running the migration manually?
I would like to avoid downloading a production/testing DB to a dev machine just to check migrations, because that DB contains private data and can be quite big.
First, you need to create a sample database dump in the state before the migration. For MySQL use mysqldump; for PostgreSQL use pg_dump, e.g.:
mysqldump -u root -p mydatabase > dump-2018-02-20.sql
pg_dump -Upostgres --inserts --encoding utf8 -f dump-2018-02-20.sql mydatabase
Then create an abstract class for all migrations tests (I assume you have configured a separate database for integration testing in config_test.yml):
abstract class DatabaseMigrationTestCase extends WebTestCase {
/** @var ResettableContainerInterface */
protected $container;
/** @var Application */
private $application;
protected function setUp() {
$this->container = self::createClient()->getContainer();
$kernel = $this->container->get('kernel');
$this->application = new Application($kernel);
$this->application->setAutoExit(false);
$this->application->setCatchExceptions(false);
$em = $this->container->get(EntityManagerInterface::class);
$this->executeCommand('doctrine:schema:drop --force');
$em->getConnection()->exec('DROP TABLE IF EXISTS public.migration_versions');
}
protected function loadDump(string $name) {
$em = $this->container->get(EntityManagerInterface::class);
$em->getConnection()->exec(file_get_contents(__DIR__ . '/dumps/dump-' . $name . '.sql'));
}
protected function executeCommand(string $command): string {
$input = new StringInput("$command --env=test");
$output = new BufferedOutput();
$input->setInteractive(false);
$returnCode = $this->application->run($input, $output);
if ($returnCode != 0) {
throw new \RuntimeException('Failed to execute command. ' . $output->fetch());
}
return $output->fetch();
}
protected function migrate(string $toVersion = '') {
$this->executeCommand('doctrine:migrations:migrate ' . $toVersion);
}
}
Example migration test:
class Version20180222232445_MyMigrationTest extends DatabaseMigrationTestCase {
/** @before */
public function prepare() {
$this->loadDump('2018-02-20');
$this->migrate('20180222232445');
}
public function testMigratedSomeData() {
$em = $this->container->get(EntityManagerInterface::class);
$someRow = $em->getConnection()->executeQuery('SELECT * FROM myTable WHERE id = 1')->fetch();
$this->assertEquals(1, $someRow['id']);
// check other stuff if it has been migrated correctly
}
}
I've figured out simple "smoke tests" for Doctrine Migrations.
I have a PHPUnit test performing the following steps:
1. Drop the test DB
2. Create the test DB
3. Load migrations (create schema)
4. Load fixtures (imitate production data)
5. Migrate to some older version
6. Migrate back to the latest version
This way I can test for the major issues we've had recently.
Example of PHPUnit tests can be found on my blog: http://damiansromek.pl/2015/09/29/how-to-test-doctrine-migrations/
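The steps above can be sketched as an ordered list of console commands plus a small runner that stops at the first failure. This is not the code from the linked blog post: the function names are hypothetical, and the exact commands and flags (in particular --if-exists and the version format) are assumptions that should be checked against the installed DoctrineBundle, DoctrineFixturesBundle and Doctrine Migrations versions.

```php
<?php

/**
 * Returns the console commands for the smoke test, in order.
 * $olderVersion is the migration version to roll back to.
 */
function migrationSmokeTestCommands(string $olderVersion): array
{
    return [
        'doctrine:database:drop --force --if-exists',
        'doctrine:database:create',
        'doctrine:migrations:migrate --no-interaction',
        'doctrine:fixtures:load --no-interaction',
        'doctrine:migrations:migrate ' . $olderVersion . ' --no-interaction',
        'doctrine:migrations:migrate --no-interaction',
    ];
}

/**
 * Runs each command through $run (which must return an exit code)
 * and stops at the first failure.
 * Returns the failed command, or null if everything succeeded.
 */
function runSmokeTest(array $commands, callable $run): ?string
{
    foreach ($commands as $command) {
        if ($run($command) !== 0) {
            return $command;
        }
    }

    return null;
}
```

In a PHPUnit test, $run could wrap a console Application with auto-exit disabled and a StringInput built from the command string, and the test would assert that runSmokeTest() returns null.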

php app/console stops working

After a slight modification of my entities, I wanted to update the schema with a simple php app/console doctrine:update --force. But no action was executed and, in addition, there was no response. I then ran php app/check.php, which reported no problems (Your system is ready to run Symfony2 projects). I do not understand, and it doesn't report any error. Here's what I've done:
Command: ********: ***** ProjetSymphony $ php app/console***
Answer (none): ******* **** $ ProjetSymphony***
If someone has an idea.
Try with:
php app/console doctrine:schema:update --force
Maybe it's only a syntax error.
Also, if anyone tries to run php app/console in a newer Symfony version (for example Symfony 3.0), they will get an error (no file found) because the file was moved to the bin folder. To run the console there, you have to use php bin/console instead. Just in case this change confuses anyone who started learning Symfony and updated to 3.0.
I finally found my mistake. I had a command file (CreateUserCommand.php) that prevented my command from executing.
If someone wants to explain to me why this file caused a failure during the execution of my command ...
Here is the file :
<?php
namespace FP\UserBundle\Command;
use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use FOS\UserBundle\Model\User;
use FOS\UserBundle\Command\CreateUserCommand as BaseCommand;
class CreateUserCommand extends BaseCommand
{
/**
* @see Command
*/
protected function configure()
{
exit;
echo "tes";
parent::configure();
$this
->setName('fp:user:create')
->getDefinition()->addArguments(array(
new InputArgument('age', InputArgument::REQUIRED, 'The age')
))
;
}
/**
* @see Command
*/
protected function execute(InputInterface $input, OutputInterface $output)
{
exit;
echo "tes";
$username = $input->getArgument('username');
$email = $input->getArgument('email');
$password = $input->getArgument('password');
$age = $input->getArgument('age');
$inactive = $input->getOption('inactive');
$superadmin = $input->getOption('super-admin');
$manipulator = $this->getContainer()->get('fos_user.util.user_manipulator');
$manipulator->setAge($age);
$manipulator->create($username, $password, $email, !$inactive, $superadmin);
$output->writeln(sprintf('Created user <comment>%s</comment>', $username));
}
/**
* @see Command
*/
protected function interact(InputInterface $input, OutputInterface $output)
{
exit;
echo "tes";
parent::interact($input, $output);
if (!$input->getArgument('age')) {
$age = $this->getHelper('dialog')->askAndValidate(
$output,
'Please choose a age:',
function($age) {
if (empty($age)) {
throw new \Exception('Lastname can not be empty');
}
return $age;
}
);
$input->setArgument('age', $age);
}
}
}

Running console command from a Symfony 2 test case

Is there a way to run a console command from a Symfony 2 test case? I want to run the doctrine commands for creating and dropping schemas.
This documentation chapter explains how to run commands from different places. Mind that using exec() for this is quite a dirty solution...
The right way of executing console command in Symfony2 is as below:
Option one
use Symfony\Bundle\FrameworkBundle\Console\Application as App;
use Symfony\Component\Console\Tester\CommandTester;
class YourTest extends WebTestCase
{
public function setUp()
{
$kernel = $this->createKernel();
$kernel->boot();
$application = new App($kernel);
$application->add(new YourCommand());
$command = $application->find('your:command:name');
$commandTester = new CommandTester($command);
$commandTester->execute(array('command' => $command->getName()));
}
}
Option two
use Symfony\Component\Console\Input\StringInput;
use Symfony\Bundle\FrameworkBundle\Console\Application;
class YourClass extends WebTestCase
{
protected static $application;
public function setUp()
{
self::runCommand('your:command:name');
// you can also specify an environment:
// self::runCommand('your:command:name --env=test');
}
protected static function runCommand($command)
{
$command = sprintf('%s --quiet', $command);
return self::getApplication()->run(new StringInput($command));
}
protected static function getApplication()
{
if (null === self::$application) {
$client = static::createClient();
self::$application = new Application($client->getKernel());
self::$application->setAutoExit(false);
}
return self::$application;
}
}
P.S. Guys, don't shame Symfony2 with calling exec()...
The docs tell you the suggested way to do it. The example code is pasted below:
protected function execute(InputInterface $input, OutputInterface $output)
{
$command = $this->getApplication()->find('demo:greet');
$arguments = array(
'command' => 'demo:greet',
'name' => 'Fabien',
'--yell' => true,
);
$input = new ArrayInput($arguments);
$returnCode = $command->run($input, $output);
// ...
}
Yes, if your directory structure looks like
/symfony
/app
/src
then you would run
phpunit -c app/phpunit.xml.dist
From your unit tests you can run PHP commands either by using
passthru("php app/console [...]") (http://php.net/manual/en/function.passthru.php)
exec("php app/console [...]") (http://www.php.net/manual/en/function.exec.php)
or by putting the command in backticks:
`php app/console [...]`
If you are running the unit tests from a directory other than symfony, you'll have to adjust the relative path to the app directory for it to work.
To run it from the app:
// the document root should be the web folder
$root = $_SERVER['DOCUMENT_ROOT'];
passthru("php $root/../app/console [...]");
The documentation has been updated since my last answer to reflect the proper Symfony 2 way of calling an existing command:
http://symfony.com/doc/current/components/console/introduction.html#calling-an-existing-command
