Categories

(111)
(64)
(166)
(21)

Migrate - module for data import in Drupal

04.07.2014
Migrate module
Author:

Sometimes while website development it is necessary to transfer data from one database to another. Often it is either migration to a newer Drupal version (from 6.x to 7.x) or transfer of content to Drupal from another platform. Migrate module makes a very convenient tool for importing data in such cases. But sometimes its default settings are just not enough for the transfer (for example, when importing data from different fields into one, or, if it is necessary to consider time shift between servers for date type fields). In such cases web developers need to extend module’s possibilities by themselves. Below we will discuss techniques that will help you to solve these issues.

Let’s agree that the current database name is 'drupal_7', and the name of the dump from which you want to import the data - 'drupal_6'. Then you need to register both connections to databases in settings.php to switch between them conveniently:

  $databases = array (
    'default' => array (
      'default' => array (
        'database' => 'drupal_7',
        'username' => 'root',
        'password' => '*****',
        'host' => 'localhost',
        'port' => '',
        'driver' => 'mysql',
         'prefix' => '',
      ),
    ),
    'old_database' => array (
      'default' => array (
        'database' => 'drupal_6',
        'username' => 'root',
        'password' => '*****',
        'host' => 'localhost',
        'port' => '',
        'driver' => 'mysql',
        'prefix' => '',
      ),
    ),
  );

As we see the connection key 'old_database' will be used in our case. To define it conveniently during the import process you need to create an admin page with the appropriate settings. Also here you can add settings for old site domain name (if this site is still functioning) and time shift value on the server (if it is different on two sites). We will need an old site domain name to get files through the existing path in the database.

Let's create a module that will contain the necessary settings and classes for our import. The name of the module will be 'mymodule_migration'. Here you also need to set the administration page with the required parameters. Then our mymodule_migration.module will look like this:

/**
 * Implements hook_menu().
 */
function mymodule_migration_menu() {
  $items['admin/content/migrate/mymodule_migration'] = array(
    'title'            => 'Additional Migration settings',
    'page callback'    => 'drupal_get_form',
    'page arguments'   => array('mymodule_migration_settings'),
    'access arguments' => array(MIGRATE_ACCESS_BASIC),
    'file'             => 'mymodule_migration.admin.inc',
    'type'             => MENU_LOCAL_TASK,
    'weight'           => 10,
  );

  return $items;
}

And mymodule_migration.admin.inc will contain the administration form:

/**
 * Migration settings form.
 */
function mymodule_migration_settings($form, &$form_state) {
  $form['mymodule_migration_database_key'] = array(
    '#type'     => 'textfield',
    '#title'    => t('D6 database key'),
    '#required' => TRUE,
    '#default_value' => variable_get('mymodule_migration_database_key', ''),
  );
  $form['mymodule_migration_old_site_domain'] = array(
    '#type'     => 'textfield',
    '#title'    => t('Old site domain name'),
    '#required' => TRUE,
    '#default_value' => variable_get('mymodule_migration_old_site_domain', ''),
 );
  $form['mymodule_migration_time_shift'] = array(
    '#type'     => 'textfield',
    '#title'    => t('Time shift between sites'),
    '#default_value' => variable_get('mymodule_migration_time_shift', ''),
    '#description'   => t('A date/time string. e.g. "-1 hour".'),
  );

  system_settings_form($form);
}

Let’s now take a look at Migrate module custom classes. They will describe the process of building a query to obtain the necessary data from the database, as well as maps of correspondences between fields. For this purpose hook_migrate_api() is used. It should be described in MODULENAME.migrate.inc file. In our case it is called mymodule_migration.migrate.inc. You should create 'includes' folder in the module’s directory which will contain all the necessary Migrate’s .inc files (including class files as well). Describing the class for Page content type migration we will get mymodule_migration.migrate.inc in such form:

/**
 * Implements hook_migrate_api().
 */
function mymodule_migration_migrate_api() {
  $api = array(
    'api' => 2,
    'groups' => array(
      'mymodule' => array(
        'title' => t(‘Mymodule Imports'),
      ),
    ),
    'migrations' => array(
      'MigratePages' => array(
        'class_name' => 'MymodulePagesMigration',
        'title'      => 'Pages',
        'group_name' => 'mymodule',
      ),
    ),
  );
  return $api;
}

If you need to import only one content type than our new class should inherit the basic Migration class. While importing multiple content types it is better to create the import’s basic class which will inherit basic Migration class. Thus all other classes of a particular content type migration will inherit our basic class.

Let’s proceed to creation of a basic class. First create “node.inc” file in “includes” folder. Describe your class there:

/**
 * @file
 * Basic class to handle the nodes migration.
 */
class MymoduleNodeMigration extends Migration {
  /**
   * General initialization of a Migration object.
   */
  public function __construct($args, $type) {
    parent::__construct($args);
  }
}

Here you need to add some protected class properties which we will need later:

  • $migrateUid - author identifier (you can set 1 by default),

  • $databaseKey - database key of site’s old version,

  • $oldSiteDomain - domain name of site’s old version,

  • $timeShift - time shift between servers,

  • $sourceType - the type of imported (source) content,

  • $destinationType - the name of the destination content type (to which the data is imported).

Now it is necessary to change our class constructor and to fill in the described properties. Constructor should look like this:

/**
 * General initialization of a Migration object.
 */
public function __construct($args, $type) {
  parent::__construct($args);

  $this->databaseKey     = variable_get('mymodule_migration_database_key');
  $this->oldSiteDomain   = variable_get('mymodule_migration_old_site_domain');
  $this->timeShift       = variable_get('mymodule_migration_time_shift');
  $this->description     = t('Migrate Nodes');
  $this->destinationType = $type;
}

Let’s create a method that will be responsible for the construction of correspondences between content type fields and query fields - buildSimpleFieldsMapping(). This is an important point which determines which values a new content type fields ​​will be filled with.

/**
 * Build fields mapping.
 */
public function buildSimpleFieldsMapping() {
  $source_fields = array(
    'nid'               => t('The node ID of the page'),
    'linked_files'      => t('The set of linked files'),
    'right_side_images' => t('The set of images that previously appeared on the side'),
  );

  // Setup common mappings.
  $this->addSimpleMappings(array('title', 'status', 'created', 'changed', 'comment', 'promote', 'sticky'));

  // Make the mappings.
  $this->addFieldMapping('is_new')->defaultValue(TRUE);
  $this->addFieldMapping('uid')->defaultValue($this->migrateUid);
  $this->addFieldMapping('revision')->defaultValue(TRUE);
  $this->addFieldMapping('revision_uid')->defaultValue($this->migrateUid);
  $this->addFieldMapping('language', 'language')->defaultValue('en');

  $this->addFieldMapping('body', 'body');
  $this->addFieldMapping('body:summary', 'teaser');
  $this->addFieldMapping('body:format', 'format');

  // Build base query and fields mapping.
  $query = $this->query();

  $this->highwaterField = array(
    'name'  => 'changed',
    'alias' => 'n',
    'type'  => 'int',
  );

  // Generate source from the base query.
  $this->source = new MigrateSourceSQL($query, $source_fields, NULL,
    array(
      'map_joinable' => FALSE,
      'cache_counts' => TRUE,
      'cache_key'    => 'migrate_' . $this->sourceType
    )
  );
  $this->destination = new MigrateDestinationNode($this->destinationType);

  // Build map.
  $this->map = new MigrateSQLMap($this->machineName,
    array(
      'nid' => array(
        'type'        => 'int',
        'unsigned'    => TRUE,
        'not null'    => TRUE,
        'description' => t('D6 Unique Node ID'),
        'alias'       => 'n',
      )
    ),
    MigrateDestinationTerm::getKeySchema()
  );
 } 

We have described the correspondence only between the basic fields but our content type may also contain some additional fields that should also be added to the map of correspondences. To build a data fetch query query() method is used. The very process should be divided into two parts:

  • information about additional fields;

  • adding of the fields to map of correspondences.

For getting some additional information about the fields you can use CSV file. You can enter the necessary data in the following format: “Source field", "Destination field", "Field description", "Field type". Such CSV file should exist for each content type. For a better organization you may create a separate folder for such files, e.g. 'mappings'.

 It is better to name the files according to the content type source. Here is an example of how such file should look like:

"Source field","Destination field","Description","Field type"
"field_attached_file","field_file","File","file"

Let’s get back to the query() method. We already have the necessary information about additional fields and we can finish with the construction of the fields correspondence map.

/**
 * Query for basic node fields from Drupal 6.
 *
 * @return QueryConditionInterface
 */
protected function query() {
  // Basic query for node.
  $query = Database::getConnection('default', $this->databaseKey)
    ->select('node', 'n')
    ->fields('n')
    ->condition('n.type', $this->sourceType)
    ->groupBy('n.nid')
    ->orderBy('n.changed');
  $query->join('node_revisions', 'nr', 'n.vid = nr.vid');
  $query->fields('nr', array('body', 'teaser', 'format'));
  $query->join('content_type_' . $this->sourceType, 'ct', 'n.vid = ct.vid');

  // Get fields mapping.
  $field_mappings = DRUPAL_ROOT . '/' . drupal_get_path('module', 'mymodule_migration') . '/mappings/' . $this->sourceType . '.csv';
  $result = fopen($field_mappings, 'r');

  // Make sure there actually map is exists.
  if (!$result) {
    Migration::displayMessage(t('Empty mapping for !type', array('!type' => $this->sourceType)));
    return array();
  }

  // Skip the header.
  fgets($result);
  $fields = array();

  // Run through the mappings - 0 is a source field name, 1 is a destination
  // field name, 2 is a description of the mapping, 3 is a field type.
  while ($row = fgetcsv($result)) {
    $cck_field         = $row[0];
    $destination_field = $row[1];
    $field_description = $row[2];
    $field_type        = $row[3];

    // Getting field columns.
    $field_columns = Database::getConnection('default', $this->databaseKey)
      ->select('content_node_field', 'nf')
      ->fields('nf', array('db_columns'))
      ->condition('field_name', $cck_field)
      ->execute()
      ->fetchField();
    $field_columns = unserialize($field_columns);

    switch ($field_type) {
      case 'term':
        // For the term field the $cck_field should be a vocabulary id.
        $column_field = "{$destination_field}_{$cck_field}";
        $query->leftJoin('term_node', 'tn', 'n.vid = tn.vid');
        $query->leftJoin('term_data', 'td', "tn.tid = td.tid AND td.vid = '{$cck_field}'");
        $query->addField('td', 'name', $column_field);
        $this->addFieldMapping($destination_field, $column_field);
        $this->term_fields[$column_field] = $cck_field;
        break;

      default:
        // Checking which table should be used for getting field values.
        $argument = 0;
        if (Database::getConnection('default', $this->databaseKey)->schema()->tableExists('content_' . $cck_field)) {
          $query->leftJoin('content_' . $cck_field, $cck_field, "n.vid = {$cck_field}.vid");
          foreach (array_keys($field_columns) as $column_name) {
            $column_field = "{$cck_field}_{$column_name}";
            $this->buildFieldMapping($destination_field, $cck_field, $column_name, $column_field, $field_type, $argument);
            if (in_array($field_type, array('file', 'image', 'multiple'))) {
              $query->addExpression("GROUP_CONCAT(DISTINCT {$cck_field}.{$column_field})", $column_field);
            }
            else {
              $query->addField($cck_field, $column_field, $column_field);
            }
            $argument++;
          }
        }
        elseif (!empty($field_columns)) {
          foreach (array_keys($field_columns) as $column_name) {
            $column_field = "{$cck_field}_{$column_name}";
            $this->buildFieldMapping($destination_field, $cck_field, $column_name, $column_field, $field_type, $argument);
            $query->addField('ct', $column_field, $column_field);
            $argument++;
          }
        }
    }
   // Build a specific field mapping by the field instance.
   if (!empty($field_columns)) {
     $this->buildFieldInstanceMapping($destination_field, $cck_field, $field_type, $field_columns);
   }
  }
  return $query;
}

To add the additional fields to the map buildFieldMapping() method is used. In our case this method will look like this:

/**
 * Add field mapping.
 */
protected function buildFieldMapping($destination_field, $cck_field, $column_name, $column_field, $field_type, $argument) {
  $column_field = "{$cck_field}_{$column_name}";
  if (empty($argument)) {
    if (!isset($this->codedFieldMappings[$destination_field])) {
      if (in_array($field_type, array('file', 'image', 'multiple'))) {
        $this->addFieldMapping($destination_field, $column_field)
          ->separator(',');
      }
      else {
        $this->addFieldMapping($destination_field, $column_field);
      }
    }
    else {
      $this->duplicate_destination[$column_field] = $destination_field;
    }
  }
  else {
    $this->addFieldMapping("{$destination_field}:{$column_name}", $column_field);
  }
  if ($column_name == 'format' && $field_type == 'text') {
    $this->text_format_fields[] = $column_field;
  }
}

Also call for buildFieldInstanceMapping() method is used in query(). It had been developed for performing some additional actions with fields of 'file' and 'image'  types:

/**
 * Add a specific field mapping according to the field instance.
 */
protected function buildFieldInstanceMapping($destination_field, $cck_field, $field_type, $field_columns) {
  switch ($field_type) {
    case 'file':
    case 'image':
      // Get field instance settings.
      $instance_settings = field_info_instance('node', $destination_field, $this->destinationType);
      // The file_class determines how the 'image' value is interpreted, and what
      // other options are available. In this case, MigrateFileUri indicates that
      // the 'image' value is a URI.
      $this->addFieldMapping("{$destination_field}:file_class")
        ->defaultValue('MigrateFileUri');
      // Here we specify the directory containing the source files.
      $this->addFieldMapping("{$destination_field}:source_dir")
        ->defaultValue($this->oldSiteDomain);
      // Directory that source images will be copied to.
      $this->addFieldMapping("{$destination_field}:destination_dir")
        ->defaultValue(file_default_scheme() . '://' . $instance_settings['settings']['file_directory']);
      // And we map the alt and title values in the database to those on the image.
      $this->addFieldMapping("{$destination_field}:alt", 'title');
      $this->addFieldMapping("{$destination_field}:title", 'title');

      $this->addUnmigratedDestinations(array(
        "{$destination_field}:destination_file",
        "{$destination_field}:file_replace",
        "{$destination_field}:language",
        "{$destination_field}:preserve_files",
        "{$destination_field}:urlencode",
      ));
      $this->file_fields[] = $cck_field . '_' . key($field_columns);
      break;

    case 'date':
      $this->date_fields[] = $cck_field . '_' . key($field_columns);
      break;
  }
}

As it was mentioned above, the specific actions with the concrete fields can be performed using prepareRow() and prepare() methods. Here is the demonstration of prepareRow() use on the example of date fields. Information about them is stored in 'date_fields' array of our migration class instance:

/**
  * Implemens prepareRow().
  *
  * @param Object $row
  *   Object containing raw source data.
  *
  * @return bool
  *   TRUE to process this row, FALSE to have the source skip it.
  */
public function prepareRow($row) {
  // Make a time shift for date fields.
  if (!empty($this->date_fields) && !empty($this->timeShift)) {
    foreach ($this->date_fields as $date_field) {
      if (!empty($row->$date_field)) {
        $row->$date_field = date('Y-m-d\TH:i:s', strtotime($this->timeShift, strtotime($row->$date_field)));
      }
    }
  }
}

As for prepare() method it takes two parameters - prepare ($node, stdClass $row). This allows to manipulate with the very object of content type which will be saved having an access to $row as well.

Now let’s get back to the creation of migration class for a specific content type - Page which  has already been described in hook_migrate_api. Let’s create page.inc file in ‘includes’ folder with migration class of this content type:

/**
 * @file
 * Handle the migration Pages.
 */

class MymodulePagesMigration extends MymoduleNodeMigration {
  /**
  * General initialization of a Migration object.
  */
  public function __construct($args) {
     parent::__construct($args, 'page');

     $this->sourceType  = 'page';
     $this->description = t('Migrate Pages');

     if (!empty($this->databaseKey)) {
       $this->buildSimpleFieldsMapping();
     }
  }
}

If it is necessary to perform the specific operations with content type field you can do this directly in its import class using the same methods prepareRow() and prepare().

Here we have outlined the basic principles of work with content import using database along with map of fields correspondence dynamic building and processing. We hope that you have found something new and useful in this blog!

10 votes, Rating: 5

Read also

1

Developing our store package CommerceBox, we have faced an issue of keeping specific functionality for apps in features. Feature Injection technique solved this problem.

2

Being a member of Drupal Community is very responsible. That is why internetDevels company organized regular 3-months contribution code sprint. The results are promising.

3

Node.js is a powerful platform for building all kinds of applications which can be integrated with Drupal 7. From our blog you will learn how to install and set up...

4

Often one can see such charges against Drupal: “It is impossible to build a high-load website using this framework”. We will refute this myth on...

5

One of the main issues after making decision to launch a website is development platform - CMS or custom coding? Let's elucidate Drupal framework advantages.

Need a quote? Let's discuss the project

Are you looking for someone to help you with your Drupal Web Development needs? Let’s get in touch and discuss the requirements of your project. We would love to hear from you.

Join the people who have already subscribed!

Want to be aware of important and interesting things happening? We will inform you about new blog posts on Drupal development, design, QA testing and more, as well news about Drupal events.

No charge. Unsubscribe anytime