Data Import & Seeding Flow

One-time database population from supplied CSV files

Overview

This workflow describes how the skills, behaviours, knowledge areas, and activities supplied as CSV files are imported into the database during initial setup.

64 Skills

Hierarchical taxonomy across 6 categories

Source: Skills-Skills.csv

22 Behaviours

Flat list of character traits

Source: Behaviours-Table 1.csv

7+ Knowledge Areas

Household, Society, Well-being, Travel, etc.

Source: Multiple [Category]-Table 1.csv files

100+ Activities

Progressive activities with levels 1-3

Source: Multiple activity CSV files

Database Seeding Pipeline

Start

Application First Run / Database Reset

Trigger: A developer runs the database seed script, or the application performs its initial setup

Controller: Orchestrator

seedDatabase()

Main orchestration function that calls all import functions in the correct order

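Dependencies: Skills must be imported before Knowledge Areas and Activities, and Knowledge Areas before Activities, so that related-skill links and foreign-key (FK) relationships resolve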
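A minimal sketch of the ordering logic, not the actual implementation. The source gives seedDatabase() no parameters; here the import steps are injected as functions so the order can be shown in isolation. ImportSummary, ImportStep, and SeedSteps are hypothetical names. The source states only that skills precede activities; running knowledge areas before activities is inferred from the Knowledge Area ID foreign key on activity records. A real version would most likely be asynchronous.

```typescript
// Hypothetical shapes: the source names the import functions but not their signatures.
interface ImportSummary {
  name: string;
  count: number;
  errors: string[];
}

type ImportStep = () => ImportSummary;

interface SeedSteps {
  skills: ImportStep;
  behaviours: ImportStep;
  knowledgeAreas: ImportStep;
  activities: ImportStep;
}

// Runs every import step in dependency order and collects the summaries.
// Skills come first because later steps link to them; activities come last
// because they reference knowledge areas by foreign key (assumed ordering).
function seedDatabase(steps: SeedSteps): ImportSummary[] {
  const order: ImportStep[] = [
    steps.skills,
    steps.behaviours,
    steps.knowledgeAreas,
    steps.activities,
  ];
  const summaries: ImportSummary[] = [];
  for (const step of order) {
    summaries.push(step());
  }
  return summaries;
}
```

The returned summaries can feed the seeding report at the end of the pipeline.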

Controller: Skills Import

importSkillsFromCSV('Skills-Skills.csv')

  • Parses hierarchical structure: Category → Sub-category → Skill
  • Stores 64 skills across 6 categories:
    • Literacy (14), Communication (10), Numeracy (13)
    • Motor skills (4), Analytical (13), Cognitive (10)
  • Captures alternative names for fuzzy matching
  • Returns import summary (count, errors)
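The parsing step could look roughly like the sketch below. It assumes a CSV parser has already produced rows of cells (header removed) in the Skills-Skills.csv column order, that blank Category and Sub-category cells inherit the value above them, and that alternative names are separated by commas or semicolons. The function name importSkillsFromRows, the record shapes, and the separator are all assumptions, not taken from the source.

```typescript
interface SkillRecord {
  id: number;
  name: string;
  category: string;
  subCategory: string;
  description: string;
  alternativeNames: string[];
}

interface SkillImportSummary {
  skills: SkillRecord[];
  count: number;
  errors: string[];
}

// Assumed column order: [blank], Category, Sub-category, Skill,
// Previous/alternative names, Explanation.
function importSkillsFromRows(rows: string[][]): SkillImportSummary {
  const skills: SkillRecord[] = [];
  const errors: string[] = [];
  let category = "";
  let subCategory = "";

  rows.forEach((row, index) => {
    const cell = (i: number): string => (row[i] ?? "").trim();

    // Carry the hierarchy down: a blank cell keeps the previous value.
    const categoryCell = cell(1);
    if (categoryCell !== "") {
      category = categoryCell;
      subCategory = ""; // a new category starts without a sub-category
    }
    const subCategoryCell = cell(2);
    if (subCategoryCell !== "") subCategory = subCategoryCell;

    const name = cell(3);
    if (name === "") return; // heading-only or empty row
    if (category === "") {
      errors.push(`Row ${index + 1}: skill "${name}" has no category`);
      return;
    }

    skills.push({
      id: skills.length + 1, // placeholder ID; a database would assign its own
      name,
      category,
      subCategory,
      description: cell(5),
      alternativeNames: cell(4)
        .split(/[,;]/) // assumed separators
        .map((s) => s.trim())
        .filter((s) => s !== ""),
    });
  });

  return { skills, count: skills.length, errors };
}
```

Comparing the returned count against the expected 64 gives a quick check that no rows were dropped.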

Database: Skills Table

64 Skill Records Created

Fields: Skill ID, Name, Category, Sub-category, Description, Alternative names

Controller: Behaviours Import

importBehavioursFromCSV('Behaviours-Table 1.csv')

  • Parses flat list of character traits
  • Stores 22 behaviours including: Independent, Motivated, Resilient, Curious, Empathic, Creative, Compassionate, Confident, Kind, etc.
  • Captures alternative names
  • Returns import summary
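Because the behaviours file is flat, the import reduces to one record per row. The sketch below assumes pre-parsed rows in the Behaviours-Table 1.csv column order (Behaviours, Behaviour previously called, Explanation) and comma-separated alternative names. It also reports duplicate names, which is an added safeguard rather than something the source specifies. importBehavioursFromRows and the record shapes are hypothetical.

```typescript
interface BehaviourRecord {
  id: number;
  name: string;
  description: string;
  alternativeNames: string[];
}

interface BehaviourImportSummary {
  behaviours: BehaviourRecord[];
  count: number;
  errors: string[];
}

// Assumed column order: Behaviours, Behaviour previously called, Explanation.
function importBehavioursFromRows(rows: string[][]): BehaviourImportSummary {
  const behaviours: BehaviourRecord[] = [];
  const errors: string[] = [];
  const seen = new Set<string>();

  rows.forEach((row, index) => {
    const name = (row[0] ?? "").trim();
    if (name === "") return; // skip empty rows

    // Case-insensitive duplicate check (an assumption, not in the source).
    const key = name.toLowerCase();
    if (seen.has(key)) {
      errors.push(`Row ${index + 1}: duplicate behaviour "${name}"`);
      return;
    }
    seen.add(key);

    behaviours.push({
      id: behaviours.length + 1, // placeholder ID
      name,
      description: (row[2] ?? "").trim(),
      alternativeNames: (row[1] ?? "")
        .split(",") // assumed separator
        .map((s) => s.trim())
        .filter((s) => s !== ""),
    });
  });

  return { behaviours, count: behaviours.length, errors };
}
```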

Database: Behaviours Table

22 Behaviour Records Created

Fields: Behaviour ID, Name, Description, Alternative names

Controller: Knowledge Areas Import

importKnowledgeAreasFromCSV(csvPath, category)

Called for each knowledge category CSV:

  • Household: Cooking, Shopping, Budgeting, Car/Garden/House Maintenance
  • Communication, Reading (book-related)
  • Society: Politics, Volunteering, Social skills
  • Well-being: Nutrition, Physical/Mental/Spiritual health
  • Travel: Language learning, Transport
  • Tech (minimal data)

Extracts knowledge domains and links them to related skills

Database: Knowledge Areas Table

Multiple Knowledge Area Records Created

Fields: Knowledge Area ID, Category, Sub-category, Description, Related Skills (array)

Controller: Activities Import

importActivitiesFromCSV(csvPath, knowledgeAreaId)

Called for each knowledge area's activity CSV:

  • Parses activity ideas from CSV (e.g., Household-Cooking.csv has 40+ activities)
  • Creates Activity records linked to knowledge areas
  • Resolves the "Skills used" column to Skill IDs via matchSkillNameToTaxonomy()
  • Stores progression level (1-3 scale)
  • Returns import summary with skill matching report
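The steps above can be sketched as follows. This is an illustration under stated assumptions: rows are already parsed, the header is removed, and the columns follow the activity CSV layout ([blank], Category/Knowledge, Sub-category, Knowledge/Ideas for activities, Skills used, Level). The skill matcher is passed in as a function so the sketch stays self-contained. importActivitiesFromRows, ActivityRecord, and SkillMatcher are hypothetical names.

```typescript
interface ActivityRecord {
  knowledgeAreaId: number;
  name: string;
  targetSkillIds: number[];
  progressionLevel: number;
}

interface ActivityImportSummary {
  activities: ActivityRecord[];
  unmatchedSkills: string[];
  errors: string[];
}

// Stand-in for matchSkillNameToTaxonomy(): returns a Skill ID or null.
type SkillMatcher = (skillName: string) => number | null;

function importActivitiesFromRows(
  rows: string[][],
  knowledgeAreaId: number,
  matchSkill: SkillMatcher
): ActivityImportSummary {
  const summary: ActivityImportSummary = {
    activities: [],
    unmatchedSkills: [],
    errors: [],
  };

  rows.forEach((row, index) => {
    const name = (row[3] ?? "").trim();
    if (name === "") return; // no activity on this row

    // Only levels 1, 2, and 3 are valid.
    const level = Number((row[5] ?? "").trim());
    if (level !== 1 && level !== 2 && level !== 3) {
      summary.errors.push(`Row ${index + 1}: invalid level "${row[5] ?? ""}"`);
      return;
    }

    // "Skills used" is a comma-separated list of skill names.
    const targetSkillIds: number[] = [];
    for (const raw of (row[4] ?? "").split(",")) {
      const skillName = raw.trim();
      if (skillName === "") continue;
      const id = matchSkill(skillName);
      if (id === null) {
        summary.unmatchedSkills.push(skillName); // kept for the matching report
      } else if (targetSkillIds.indexOf(id) === -1) {
        targetSkillIds.push(id);
      }
    }

    summary.activities.push({
      knowledgeAreaId,
      name,
      targetSkillIds,
      progressionLevel: level,
    });
  });

  return summary;
}
```

An activity with unmatched skills is still imported in this sketch; only the unmatched names are set aside. Whether such activities should instead be rejected is a design decision the source leaves open.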

Helper: Skill Matching

matchSkillNameToTaxonomy(skillNameString)

  • Fuzzy-matching function that links an activity's "Skills used" entries to the Skills table
  • Handles variations in skill naming between CSVs
  • Returns matched Skill ID or null
  • Challenge: Skill names in activities may not exactly match Skills-Skills.csv
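One possible approach, sketched below, normalises both strings and then picks the skill whose name or alternative name has the smallest Levenshtein edit distance, provided it is within a threshold. The source does not name a matching algorithm, so edit distance, the default threshold of 2, and the extra skills parameter (the source signature takes only skillNameString) are all assumptions.

```typescript
interface Skill {
  id: number;
  name: string;
  alternativeNames: string[];
}

// Lowercase and replace every run of non-alphanumeric characters with a space.
function normalise(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
}

// Levenshtein distance: minimum number of single-character insertions,
// deletions, and substitutions needed to turn a into b.
function editDistance(a: string, b: string): number {
  let previous: number[] = [];
  for (let j = 0; j <= b.length; j++) previous.push(j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      current.push(
        Math.min(
          (previous[j] as number) + 1, // deletion
          (current[j - 1] as number) + 1, // insertion
          (previous[j - 1] as number) + cost // substitution or match
        )
      );
    }
    previous = current;
  }
  return previous[b.length] as number;
}

// Returns the ID of the closest skill, or null when nothing is within maxDistance.
function matchSkillNameToTaxonomy(
  skillNameString: string,
  skills: Skill[],
  maxDistance: number = 2
): number | null {
  const wanted = normalise(skillNameString);
  if (wanted === "") return null;

  let bestId: number | null = null;
  let bestDistance = maxDistance + 1;
  for (const skill of skills) {
    const candidates = [skill.name].concat(skill.alternativeNames);
    for (const candidate of candidates) {
      const distance = editDistance(wanted, normalise(candidate));
      if (distance < bestDistance) {
        bestDistance = distance;
        bestId = skill.id;
      }
    }
  }
  return bestId;
}
```

A fixed threshold can be too loose for very short names, where two edits may change one word into another, so scaling the threshold with name length may be worth considering.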

Database: Activities Table

100+ Activity Records Created

Fields: Activity ID, Knowledge Area ID (FK), Name, Description, Target Skills (array), Progression Level (1-3), Age suitability, Time/Location/Materials

Example: Cooking activities range from "Identify kitchen equipment" (Level 1) to "Cook a meal from a recipe book" (Level 3)

Controller: Generate Report

Seeding Summary Report

  • Total skills imported (64 expected)
  • Total behaviours imported (22 expected)
  • Total knowledge areas imported
  • Total activities imported
  • Skill matching success rate (percentage of activities with successfully linked skills)
  • Any errors or warnings
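The report contents listed above could be assembled as in this sketch. The expected totals (64 skills, 22 behaviours) come from the source. SeedingCounts, SeedingReport, and buildSeedingReport are hypothetical names. The success rate is computed as activities with linked skills divided by total activities, rounded to one decimal place; how an activity qualifies as "linked" is left to the caller.

```typescript
interface SeedingCounts {
  skills: number;
  behaviours: number;
  knowledgeAreas: number;
  activities: number;
  activitiesWithLinkedSkills: number;
  errors: string[];
}

interface SeedingReport {
  lines: string[];
  skillMatchRate: number; // percentage, 0 to 100
  warnings: string[];
}

const EXPECTED_SKILLS = 64;
const EXPECTED_BEHAVIOURS = 22;

function buildSeedingReport(counts: SeedingCounts): SeedingReport {
  // Start from the errors collected during import, then add count mismatches.
  const warnings = counts.errors.slice();
  if (counts.skills !== EXPECTED_SKILLS) {
    warnings.push(`Expected ${EXPECTED_SKILLS} skills, imported ${counts.skills}`);
  }
  if (counts.behaviours !== EXPECTED_BEHAVIOURS) {
    warnings.push(`Expected ${EXPECTED_BEHAVIOURS} behaviours, imported ${counts.behaviours}`);
  }

  // Percentage with one decimal place; 0 when there are no activities.
  const skillMatchRate =
    counts.activities === 0
      ? 0
      : Math.round((counts.activitiesWithLinkedSkills / counts.activities) * 1000) / 10;

  const lines = [
    `Skills imported: ${counts.skills} (${EXPECTED_SKILLS} expected)`,
    `Behaviours imported: ${counts.behaviours} (${EXPECTED_BEHAVIOURS} expected)`,
    `Knowledge areas imported: ${counts.knowledgeAreas}`,
    `Activities imported: ${counts.activities}`,
    `Skill matching success rate: ${skillMatchRate}%`,
    `Errors and warnings: ${warnings.length}`,
  ];
  return { lines, skillMatchRate, warnings };
}
```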

End

Database Seeded

The application is ready, with the complete skills and behaviours taxonomy and the activity library loaded

CSV File Structure

Skills-Skills.csv

Columns:

  • [blank]
  • Category
  • Sub-category
  • Skill
  • Previous/alternative names
  • Explanation

Structure: Hierarchical (Category → Sub-category → Skill)

Behaviours-Table 1.csv

Columns:

  • Behaviours
  • Behaviour previously called
  • Explanation

Structure: Flat list (22 character traits)

Activity CSVs (e.g., Household-Cooking.csv)

Columns (vary):

  • [blank]
  • Category/Knowledge
  • Sub-category
  • Knowledge/Ideas for activities
  • Skills used
  • Level (1-3)

Note: Some CSVs were exported from sheets with merged cells, so a blank cell implies the value from the row above (implied hierarchy)

Implementation Considerations

One-time vs. Updatable

Question: Should this be a one-time seed or an ongoing/updatable dataset?

Options:

  • One-time: Seed during initial setup, data becomes static
  • Updatable: Allow periodic refresh when CSV data changes
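If the updatable option is chosen, re-running the import must not create duplicates. The sketch below shows one way to make a refresh idempotent: an upsert keyed on the lowercased record name, using an in-memory Map in place of a database table. The key choice, the SeedRecord shape, and upsertByName are assumptions for illustration only.

```typescript
interface SeedRecord {
  name: string;
  description: string;
}

interface UpsertResult {
  inserted: number;
  updated: number;
  unchanged: number;
}

// Inserts new records, overwrites changed ones, and leaves identical ones alone.
// The Map stands in for a database table keyed by normalised name.
function upsertByName(
  table: Map<string, SeedRecord>,
  incoming: SeedRecord[]
): UpsertResult {
  const result: UpsertResult = { inserted: 0, updated: 0, unchanged: 0 };
  for (const record of incoming) {
    const key = record.name.trim().toLowerCase();
    const existing = table.get(key);
    if (existing === undefined) {
      table.set(key, record);
      result.inserted += 1;
    } else if (existing.description !== record.description) {
      table.set(key, record);
      result.updated += 1;
    } else {
      result.unchanged += 1;
    }
  }
  return result;
}
```

Running the same input twice reports everything as unchanged on the second pass. Note that a record renamed in the CSV would be inserted as new under this scheme, so a stable identifier in the source files would be a more robust key.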

Skill Name Matching

Challenge: The "Skills used" column in activity CSVs contains comma-separated skill references, but these names may not exactly match the names in Skills-Skills.csv

Solution: Implement fuzzy matching against both skill names and their alternative names

Fallback: Log unmatched skills for manual review
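For the fallback, the unmatched names gathered across all activity CSVs could be condensed into a short review list. The sketch below groups names case-insensitively and sorts by frequency, so the most common mismatches are reviewed first. summariseUnmatchedSkills and UnmatchedSkillEntry are hypothetical names.

```typescript
interface UnmatchedSkillEntry {
  name: string; // first spelling encountered
  occurrences: number;
}

// Groups unmatched skill names (ignoring case and surrounding whitespace)
// and orders them by how often they occurred, most frequent first.
function summariseUnmatchedSkills(unmatched: string[]): UnmatchedSkillEntry[] {
  const counts = new Map<string, UnmatchedSkillEntry>();
  for (const raw of unmatched) {
    const trimmed = raw.trim();
    if (trimmed === "") continue;
    const key = trimmed.toLowerCase();
    const entry = counts.get(key);
    if (entry !== undefined) {
      entry.occurrences += 1;
    } else {
      counts.set(key, { name: trimmed, occurrences: 1 });
    }
  }

  const entries: UnmatchedSkillEntry[] = [];
  counts.forEach((entry) => {
    entries.push(entry);
  });
  // Ties are broken alphabetically so the output order is stable.
  entries.sort(
    (a, b) => b.occurrences - a.occurrences || a.name.localeCompare(b.name)
  );
  return entries;
}
```

Names that recur often are good candidates for the alternative-names column in Skills-Skills.csv.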

Tech Category Placeholder

Issue: Tech-Table 1.csv contains almost no data (the file is only 12 bytes)

Options:

  • Leave empty until data supplied
  • Populate with placeholder activities

Merged Cells in CSVs

Challenge: Some CSVs have merged cells representing hierarchical relationships

Solution: The parser must forward-fill empty cells (copy the last non-empty value down each hierarchy column) to maintain the hierarchy
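A minimal forward-fill sketch, assuming rows have already been parsed into cells and that the hierarchy columns are passed outermost first (for example Category, then Sub-category). When a parent column receives a new value, the remembered values of its child columns are cleared, so a sub-category is never carried across a category boundary. That reset rule and the forwardFill name are assumptions, not taken from the source.

```typescript
// Returns a copy of rows in which blank cells in the given hierarchy columns
// inherit the last non-blank value above them. columns must be ordered from
// the outermost level to the innermost.
function forwardFill(rows: string[][], columns: number[]): string[][] {
  const last = new Map<number, string>();

  return rows.map((row) => {
    const filled = row.slice(); // leave the input row untouched
    columns.forEach((col, depth) => {
      const value = (filled[col] ?? "").trim();
      if (value !== "") {
        last.set(col, value);
        // A new parent value invalidates the remembered child values.
        columns.slice(depth + 1).forEach((child) => {
          last.delete(child);
        });
      } else {
        const previous = last.get(col);
        if (previous !== undefined) {
          while (filled.length <= col) filled.push(""); // pad short rows
          filled[col] = previous;
        }
      }
    });
    return filled;
  });
}
```

For the activity CSVs, this would be called with the Category/Knowledge and Sub-category column indexes (1 and 2 in the layout described above) before the rows are turned into records.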