CMS Migration Patterns

This document provides a reference for AI agents and developers migrating content from popular CMS platforms into this project.

WordPress (SQL/CSV)

Primary Tables

  • wp_posts: Contains posts, pages, and attachments.

    • ID: Primary key.

    • post_title: Title.

    • post_content: Main body (HTML).

    • post_name: Slug.

    • post_excerpt: Excerpt.

    • post_status: e.g., 'publish', 'draft', 'inherit' (used by attachments and revisions).

    • post_type: 'post', 'page', 'attachment', 'revision'.

    • post_author: User ID.

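    • guid: Unique identifier in URL form. For attachments it usually holds the original file URL, but WordPress does not update it when the site moves to a new domain.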

  • wp_postmeta: Key-value pairs for posts.

    • post_id: Links to wp_posts.ID.

    • meta_key: e.g., _yoast_wpseo_title, _thumbnail_id, _wp_attached_file.

    • meta_value: The value.

  • wp_users: User data (ID, user_email, display_name).

  • wp_terms, wp_term_taxonomy, wp_term_relationships: Categories and Tags.
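
Because wp_postmeta stores one row per key, it is usually easier to work with once pivoted into one dictionary per post. The following is a minimal sketch, assuming the rows have already been read as (post_id, meta_key, meta_value) tuples; the function name pivot_postmeta and the sample rows are illustrative and not part of this project.

```python
from collections import defaultdict


def pivot_postmeta(rows):
    """Group (post_id, meta_key, meta_value) rows into {post_id: {key: value}}.

    wp_postmeta may hold several rows for the same key; this sketch keeps
    the first value it sees and ignores the rest.
    """
    meta = defaultdict(dict)
    for post_id, meta_key, meta_value in rows:
        meta[post_id].setdefault(meta_key, meta_value)
    return dict(meta)


# Illustrative rows, shaped like SELECT post_id, meta_key, meta_value FROM wp_postmeta.
SAMPLE_ROWS = [
    (7, "_thumbnail_id", "12"),
    (7, "_yoast_wpseo_title", "Hello World"),
    (12, "_wp_attached_file", "2023/05/hero.jpg"),
    (7, "_yoast_wpseo_title", "Duplicate row that is ignored"),
]

meta_by_post = pivot_postmeta(SAMPLE_ROWS)
```

With the pivoted form, a lookup such as meta_by_post[7].get("_thumbnail_id") replaces a separate query per key.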

Recommended Mapping to the CMS

  • Posts/Pages:

    • Source: wp_posts WHERE post_type IN ('post', 'page').

    • Target: posts.

    • Content: Convert post_content (HTML) into a Prose module using contentToModules.

  • Media:

    • Source: wp_posts WHERE post_type = 'attachment'.

    • Target: media_assets.

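    • URL: Use guid. If the site has changed domains, guid may be stale; in that case rebuild the URL from _wp_attached_file in wp_postmeta, which stores the path relative to the uploads directory.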

    • Alt Text: Lookup _wp_attachment_image_alt in wp_postmeta.

  • SEO (Yoast):

    • Source: wp_postmeta (lookup by post_id).

    • Title: _yoast_wpseo_title.

    • Description: _yoast_wpseo_metadesc.

  • Taxonomies:

    • category -> categories.

    • post_tag -> tags.
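
The taxonomy mapping above depends on joining the three term tables. The sketch below shows one way to do that join and group the result by target collection. It runs against an in-memory SQLite database with a reduced schema purely so it is self-contained; a real WordPress database is MySQL or MariaDB, where the placeholder style depends on the driver. The names terms_for_post and demo_connection are hypothetical.

```python
import sqlite3

# Source taxonomy -> target collection, as listed in the mapping above.
TAXONOMY_TARGETS = {"category": "categories", "post_tag": "tags"}

TERMS_SQL = """
SELECT tt.taxonomy, t.name, t.slug
FROM wp_term_relationships AS tr
JOIN wp_term_taxonomy AS tt ON tt.term_taxonomy_id = tr.term_taxonomy_id
JOIN wp_terms AS t ON t.term_id = tt.term_id
WHERE tr.object_id = ?
ORDER BY tt.taxonomy, t.name
"""


def terms_for_post(conn, post_id):
    """Return {"categories": [...], "tags": [...]} for one post."""
    grouped = {target: [] for target in TAXONOMY_TARGETS.values()}
    for taxonomy, name, slug in conn.execute(TERMS_SQL, (post_id,)):
        target = TAXONOMY_TARGETS.get(taxonomy)
        if target is not None:  # skip other taxonomies such as nav_menu
            grouped[target].append({"name": name, "slug": slug})
    return grouped


def demo_connection():
    """Build a tiny in-memory copy of the three term tables (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE wp_terms (term_id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
        CREATE TABLE wp_term_taxonomy (
            term_taxonomy_id INTEGER PRIMARY KEY, term_id INTEGER, taxonomy TEXT);
        CREATE TABLE wp_term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
        INSERT INTO wp_terms VALUES
            (1, 'News', 'news'), (2, 'Python', 'python'), (3, 'Main Menu', 'main-menu');
        INSERT INTO wp_term_taxonomy VALUES
            (10, 1, 'category'), (11, 2, 'post_tag'), (12, 3, 'nav_menu');
        INSERT INTO wp_term_relationships VALUES (42, 10), (42, 11), (42, 12);
    """)
    return conn
```

Calling terms_for_post(demo_connection(), 42) returns the category and the tag for post 42 and drops the nav_menu term, which has no target in the mapping.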

Handling Shortcodes (e.g., Enfold/Avia)

Many WordPress themes (like Enfold) store layout data as shortcodes in post_content or post_excerpt.

  • Pattern: [av_textblock]Content[/av_textblock].

  • Strategy: Import the content as a single Prose module first. For complex shortcodes (sliders, galleries), recommend dedicated modules and a follow-up parser instead of converting them inline.
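
As a rough illustration of the first pass, the sketch below unwraps av_textblock shortcodes, strips the remaining av_* tags, and reports which shortcodes were dropped so they can be queued for a follow-up parser. The regular expressions assume shortcode attributes never contain a closing bracket, which may not hold for every Enfold layout; the function names are hypothetical.

```python
import re

# [av_textblock ...]inner[/av_textblock] -> inner
TEXTBLOCK_RE = re.compile(
    r"\[av_textblock\b[^\]]*\](.*?)\[/av_textblock\]", re.DOTALL
)
# Any other opening or closing av_* tag.
AVIA_TAG_RE = re.compile(r"\[/?av_\w+[^\]]*\]")
# Names of opening av_* shortcodes.
AVIA_NAME_RE = re.compile(r"\[(av_\w+)")


def shortcodes_to_html(content):
    """Keep the HTML inside text blocks and drop all other av_* tags."""
    unwrapped = TEXTBLOCK_RE.sub(lambda m: m.group(1).strip(), content)
    return AVIA_TAG_RE.sub("", unwrapped).strip()


def unhandled_shortcodes(content):
    """List av_* shortcodes other than av_textblock, sorted by name."""
    names = {m.group(1) for m in AVIA_NAME_RE.finditer(content)}
    return sorted(names - {"av_textblock"})


SAMPLE = (
    "[av_section color='main']"
    "[av_textblock size='']<p>Hello</p>[/av_textblock]"
    "[av_slideshow ids='1,2']"
    "[/av_section]"
)
```

For SAMPLE, shortcodes_to_html yields the paragraph markup alone, and unhandled_shortcodes names av_section and av_slideshow as candidates for dedicated modules.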

Directus (SQL/JSON)

Primary Tables

  • directus_collections: List of content types.

  • directus_fields: Column definitions.

  • directus_users: User accounts.

  • directus_files: Media metadata.

  • User-defined tables: e.g., articles, pages.

Recommended Mapping

  • Content: Map each user-defined table (e.g., articles, pages) directly to posts.

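  • Media: directus_files -> media_assets. Note that filename_download holds only the file name offered on download, not a full URL; build the URL from the Directus asset endpoint (/assets/<id>) or resolve filename_disk against the storage location.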

  • Relationships: Check directus_relations to identify many-to-many links.
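
The media mapping could look roughly like the sketch below. It assumes the Directus instance serves files at <base_url>/assets/<id>, that the row is available as a dictionary, and that description (falling back to title) is an acceptable source for alt text; the last point is a guess that should be checked against the actual data. The function name map_directus_file is hypothetical.

```python
def map_directus_file(row, base_url):
    """Map one directus_files row (as a dict) to a media_assets record."""
    file_id = row["id"]
    return {
        "id": file_id,
        # Assumption: assets are reachable at <base_url>/assets/<id>.
        "url": f"{base_url.rstrip('/')}/assets/{file_id}",
        # directus_files stores the MIME type in the "type" column.
        "mime_type": row.get("type"),
        # Assumption: description or title doubles as alt text.
        "alt_text": row.get("description") or row.get("title") or "",
    }


SAMPLE_FILE = {
    "id": "1f3a5c9e-0b7d-4e2a-9c41-6d8e2f0a7b13",
    "filename_download": "hero.jpg",
    "title": "Hero",
    "type": "image/jpeg",
    "description": None,
}

asset = map_directus_file(SAMPLE_FILE, "https://cms.example.com/")
```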

Drupal (SQL)

Primary Tables (Drupal 8/9/10)

  • node: Basic node info.

  • node_field_data: Primary content table.

    • nid: Node ID.

    • title: Title.

    • status: Published status (1 = published, 0 = unpublished).

  • node__body: Content body.

    • entity_id: Links to nid.

    • body_value: HTML content.

  • file_managed: Media files.

    • uri: Path to file (e.g., public://image.jpg).
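
Values in file_managed.uri use Drupal stream wrappers rather than web URLs, so they need translating before they can populate media_assets.url. The sketch below handles public:// only and assumes the default public files directory, sites/default/files; sites that changed this setting need a different public_path. Other schemes, such as private://, return None because they are not directly downloadable. The function name is hypothetical.

```python
from urllib.parse import quote


def drupal_uri_to_url(uri, base_url, public_path="sites/default/files"):
    """Translate a public:// URI into an absolute URL, else return None."""
    scheme, separator, target = uri.partition("://")
    if not separator or scheme != "public":
        return None
    # quote() keeps "/" intact and escapes spaces and other unsafe characters.
    return f"{base_url.rstrip('/')}/{public_path.strip('/')}/{quote(target)}"
```

For example, drupal_uri_to_url("public://image.jpg", "https://example.com") gives https://example.com/sites/default/files/image.jpg.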

Canonical JSON Structure

The migration agent should always aim to produce a single JSON document containing the following tables:

1. posts

Standard fields: id, type, slug, title, excerpt, status, user_id.

2. module_instances

Every post should have its content stored in modules.

  • id: UUID.

  • type: e.g., "prose".

  • props: JSON object (e.g., {"content": "HTML_HERE"}).

3. post_modules

Joins posts to their modules.

  • id: UUID.

  • post_id: Reference to posts.id.

  • module_id: Reference to module_instances.id.

  • order_index: Integer.

4. media_assets

  • id: UUID.

  • url: External URL (the system will download it) or internal path.

  • mime_type, alt_text.
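
To show how the four tables fit together, the sketch below assembles the canonical structure from already-mapped posts, giving each post a single prose module. It is an illustration under stated assumptions rather than this project's importer: the function name build_canonical is hypothetical, each post receives exactly one module at order_index 0, and media_assets is left empty.

```python
import json
import uuid


def build_canonical(posts_with_html):
    """Build the canonical tables from (post_dict, html_string) pairs."""
    doc = {
        "posts": [],
        "module_instances": [],
        "post_modules": [],
        "media_assets": [],
    }
    for post, html in posts_with_html:
        module_id = str(uuid.uuid4())
        doc["posts"].append(post)
        doc["module_instances"].append(
            {"id": module_id, "type": "prose", "props": {"content": html}}
        )
        doc["post_modules"].append(
            {
                "id": str(uuid.uuid4()),
                "post_id": post["id"],
                "module_id": module_id,
                "order_index": 0,  # single module per post in this sketch
            }
        )
    return doc


SAMPLE_POST = {
    "id": "p1",
    "type": "post",
    "slug": "hello-world",
    "title": "Hello World",
    "excerpt": "",
    "status": "publish",
    "user_id": 1,
}

canonical = build_canonical([(SAMPLE_POST, "<p>Hi</p>")])
canonical_json = json.dumps(canonical, indent=2)
```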