Migrating v1 Primero to v2 Primero

Ronald Elledge

June 10, 2021

About Primero

- A case and incident management tool used by UNICEF and its partner agencies

https://www.unicef.org/eca/what-we-do/our-partners
These different partner agencies have different needs and requirements

- Used to track and assist children in areas of crisis and conflict

- Used to track and report on incidents of violence

- Replaces old method of using lots of paper forms

- Used in situations like refugees fleeing war zones in Iraq or Syria, natural disasters in Indonesia, the Ebola outbreak in Sierra Leone

- Highly configurable, to accommodate the different requirements, forms, and situations across implementations

About Primero - Modules

3 main modules:

CP - Child Protection: Assisting children in need of services
GBV - Gender-based Violence: Reporting on instances of violence such as assault and providing services
MRM - Monitoring and Reporting Mechanism: Reporting on bad people doing really bad things… like war crime level activity

- A Primero instance can be configured to use one or more of these modules

- Usually an instance is CP or GBV or CP & GBV or MRM

- CP is the most common usage

- MRM is far less common and so far has not been in scope for these migrations

- Extremely sensitive data

About Primero - Record Types

3 main record types:

Case: Manages an individual (usually a child) and the services they may receive

Medical Services
Therapy/Counselling Services
Legal Services
Food/Nutrition Services
Housing Services
Financial Services
Protection Services
Reunification Services

Incident: Reports on an instance of violence or abuse

Tracing Request/Trace: Used to help families locate missing children or to help separated children find their families

About Primero - Technical History

- Began as an open source project by grad students in 2010

- Ruby on Rails

When Quoin inherited the application, we upgraded from Rails 3 to Rails 4
A few years ago we upgraded again to Rails 5
v2 is still Rails 5

- Version 1 (v1) uses CouchDB, a NoSQL database

Stores data as JSON documents
Very few fields defined within the data models. Most of the fields are defined by the form definitions
Provides flexibility to configure and later modify the various forms
The UI allows some administrative users to modify the forms

- Quoin took over Primero in 2014 and began maintaining it

v1.0.0.1 tag created on April 22, 2014

- Since then, Quoin has made maintenance fixes and added enhancements to satisfy new requirements

Versions v1.1 through v1.8
v1.8 is mostly a copy of v1.7 with special modifications for Indonesia. Not really being maintained at the moment.
The main v1 version still being supported is v1.7 though we still have some older versions in production
v1.7.35 tag created on May 28, 2021

- Still open source. Quoin maintains repositories in Bitbucket. The application repo is synced with a GitHub repo.

- Other folks still submit changes to the GitHub repo which we review and merge

- The configuration repositories are not in GitHub

- Time for an overhaul

It is getting some age on it. There is newer technology now, new toys.
It has some design elements that are less than ideal
Like any application that has enhancements over time, it has some baggage

Major Changes from v1 to v2

- v1: CouchDB (NoSQL database) v2: PostgreSQL

PostgreSQL has a jsonb field type which we can use to store the configurable data for a record
jsonb fields allow us to change the configurable data without having to change the application and migrate the database
Using PostgreSQL gives us more structure and provides for some of the data model association features we really couldn’t use with CouchDB

- JavaScript React front end with an API back end

- Can integrate with an Identity Provider to handle user authentication

This can be turned on or off

- v1: Vagrant VMs v2: Docker containers

- v1: had an accompanying Android mobile app v2: no such app

- v1: deployed using Chef v2: deployed using Ansible

- Some nested elements in the main record data in v1 were pulled out into separate tables/models in v2

Alerts
Attachments
Flags
Record History
Transfer/Referral information
Traces

- Some field/attribute names changed

- New dashboard permissions

- Application translation string YAML files were cleaned up
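
To illustrate the jsonb point above, here is a minimal sketch in plain Ruby with no database involved. A record has a few fixed columns plus one free-form hash standing in for the jsonb column. The names `Record`, `data`, and the field names are hypothetical and are not taken from Primero's actual schema.

```ruby
require "json"

# Stand-in for a table row: a few fixed columns plus one free-form
# "data" hash that plays the role of a jsonb column.
# (Hypothetical names; not Primero's actual schema.)
Record = Struct.new(:id, :record_type, :data, keyword_init: true)

# Fields defined by the configurable forms live inside `data`,
# so adding one needs no schema change.
def set_field(record, name, value)
  record.data[name.to_s] = value
  record
end

record = Record.new(id: 1, record_type: "case", data: { "name" => "Example", "age" => 9 })
set_field(record, :nationality, "unknown")

# What would be written to the jsonb column:
puts JSON.generate(record.data)
```

The fixed columns are what give the relational structure and associations; everything form-driven stays inside the hash.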

Elements of the Migration Process

Configuration
Users
Record Data
Application Translation Strings (i18n stuff)

Versions Migrated

Our migration process converts a v1.7 instance to a v2 instance
If an instance is older than v1.7, it should first be migrated up to v1.7 before this migration is run. So far, we have not had cases of this.

Configuration

- Implementation-specific configurable data

Modules
Forms
Reports
Roles/permissions
Locations
Agencies / Agency Logos
System level configuration

- In v1, can be imported/exported as a JSON file

- Can be imported as Ruby record creation scripts

- Configuration is stored in a separate repository from the application

- We have configurations for each implementation (Jordan, Iraq, Bangladesh, Syria, etc.) in this repo.

- We have a repository for v1 configurations and a repository for v2 configurations

None of these are synced with repositories in GitHub

Migrating the Configuration

Export JSON config from v1 production
Import the JSON config on a v1 integration server
Run the export_configuration scripts on the v1 server to create v2-compatible Ruby config seed scripts. (The scripts extract the agency logos and create separate scripts to load the logos as file attachments.)
Copy the generated seed scripts to a new configuration repository for this v2 implementation
Deploy the v2 application and configuration to the target v2 server
Test the v2 application comparing its configuration against the v1 application
Make modifications/fixes to the seed files in the v2 configuration repo, redeploy, repeat as needed
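
The export step above turns a JSON configuration into Ruby seed scripts. The following is a rough sketch of that idea only, not the actual export_configuration scripts. It assumes the export is a JSON object keyed by model name, and `create_or_update!` is a hypothetical method name used for illustration.

```ruby
require "json"

# Turn one exported config entry into a line of Ruby seed code.
# `create_or_update!` is a hypothetical v2 call used only for illustration.
def seed_line(model_name, attributes)
  "#{model_name}.create_or_update!(#{attributes.inspect})"
end

# Build the text of a seed script from a v1 JSON config export.
# Assumes the export is an object keyed by model name, each holding
# a list of records.
def build_seed_script(json_text)
  config = JSON.parse(json_text)
  lines = config.flat_map do |model_name, records|
    records.map { |attrs| seed_line(model_name, attrs) }
  end
  lines.join("\n") + "\n"
end

export = '{"Agency": [{"agency_code": "UNICEF", "name": "UNICEF"}]}'
puts build_seed_script(export)
```

Generating plain Ruby files keeps the result reviewable and editable in the configuration repo, which matters for the fix, redeploy, repeat loop in the last step.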

Users

Changes from v1 to v2

Now require an email
Additional validations necessary to ensure we can integrate with the Identity Provider

invalid_users_report script

Before the migration, run on a v1 server loaded with production user data to identify users that have data issues that will cause them to be invalid in v2
Generates CSV files we can put in a spreadsheet and send to the users, so they can correct the data before we run the production migration
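
A minimal sketch of what such a report could look like, assuming the only rule checked is the new email requirement. The real invalid_users_report script checks more than this, and the field names `user_name` and `email` are assumptions.

```ruby
require "csv"

# Hypothetical rule: every user needs a plausible email address.
EMAIL_PATTERN = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

# Return a list of problems for one user hash (empty if the user looks valid).
def user_problems(user)
  email = user["email"].to_s.strip
  return ["missing email"] if email.empty?
  return ["malformed email"] unless email.match?(EMAIL_PATTERN)

  []
end

# Build CSV text listing each invalid user and what is wrong.
def invalid_users_csv(users)
  CSV.generate do |csv|
    csv << %w[user_name problem]
    users.each do |user|
      user_problems(user).each { |problem| csv << [user["user_name"], problem] }
    end
  end
end

users = [
  { "user_name" => "worker1", "email" => "worker1@example.org" },
  { "user_name" => "worker2", "email" => "" },
  { "user_name" => "worker3", "email" => "not-an-email" }
]
puts invalid_users_csv(users)
```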

Migrating Users

- Load the users on v1 integration server

Copy in CouchDB user data files

- Run the export_user scripts on the v1 server to create v2-compatible Ruby user seed scripts.

- Copy the generated user seed scripts and an import_user script to the target v2 server

- Copy the seed scripts and import_user script to the application Docker container

- Run import_user script.

It loops through executing each user seed script
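
The looping behavior can be sketched as follows. This is an illustration under assumptions, not the actual import_user script: it runs scripts in file-name order and collects failures rather than stopping at the first one, and the directory name in the usage lines is hypothetical.

```ruby
# Run every seed script in a directory, in file-name order, and collect
# failures instead of stopping at the first one.
# (A sketch; the real import_user script may behave differently.)
def run_seed_scripts(dir)
  failures = {}
  Dir.glob(File.join(dir, "*.rb")).sort.each do |path|
    begin
      load path
    rescue StandardError, ScriptError => e
      failures[File.basename(path)] = e.message
    end
  end
  failures
end

# Hypothetical directory name; an absent directory simply yields no scripts.
failures = run_seed_scripts("seed-files/users")
puts "#{failures.size} script(s) failed"
```

Collecting failures by file name makes it easier to fix one bad seed file and rerun just that file.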

Record Data

- Cases, Incidents, Tracing Requests

- Some elements which were embedded in the main record data in v1 were pulled out into separate tables/models in v2

Alerts
Attachments
Flags
Record History
Transfers/Referrals
Traces

- Some fields were renamed / refactored
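
The two structural changes above, pulling embedded elements out into their own rows and renaming fields, can be sketched in plain Ruby. The key names and the rename map below are hypothetical examples, not the actual Primero mappings.

```ruby
# Keys that were nested in the v1 document and become their own rows in v2.
# (Key names and the rename map are hypothetical examples.)
NESTED_KEYS = %w[flags alerts].freeze
RENAMED_FIELDS = { "child_status" => "status" }.freeze

# Split one v1 document into the main record data and child rows,
# tagging each child row with the id of the record it came from.
def split_v1_document(doc)
  children = NESTED_KEYS.to_h do |key|
    rows = Array(doc[key]).map { |row| row.merge("record_id" => doc["_id"]) }
    [key, rows]
  end
  main = doc.reject { |key, _| NESTED_KEYS.include?(key) }
  main = main.transform_keys { |key| RENAMED_FIELDS.fetch(key, key) }
  { "record" => main, "children" => children }
end

doc = {
  "_id" => "abc123",
  "child_status" => "open",
  "flags" => [{ "message" => "Follow up" }]
}
result = split_v1_document(doc)
p result["record"]
p result["children"]
```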

Migrating Record Data

Load the record data on v1 integration server. (Copy in the CouchDB data files)
If testing the production data run, anonymize the data.
We have a data anonymization script that sanitizes sensitive personal data (names, addresses, phone numbers, IDs, etc.)
Run the export_data script on the v1 server to create v2-compatible Ruby record data migration scripts.
Copy the generated record data migration scripts and import_data script to the target v2 server
Copy the record data scripts and import_data script to the application Docker container
Run import_data script. (It loops through executing each data migration script.)
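
For the anonymization step, here is a minimal sketch of one possible approach: replacing sensitive values with stable placeholders. The field list is hypothetical and this is not the actual anonymization script, which may well use a different technique such as random fake values. A truncated hash is used here only to keep the example deterministic; on its own it is not strong protection for real data.

```ruby
require "digest"

# Fields treated as sensitive in this sketch (hypothetical names).
SENSITIVE_FIELDS = %w[name address phone_number national_id].freeze

# Replace each non-empty sensitive value with a placeholder derived from
# a hash, so the same input always maps to the same fake value.
def anonymize(record)
  record.to_h do |key, value|
    if SENSITIVE_FIELDS.include?(key) && !value.to_s.empty?
      [key, "#{key}_#{Digest::SHA256.hexdigest(value.to_s)[0, 8]}"]
    else
      [key, value]
    end
  end
end

sample = { "name" => "Jane Doe", "age" => 10, "address" => "" }
p anonymize(sample)
```

Stable placeholders keep duplicate values recognizable as duplicates, which helps when comparing totals after a test migration.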

Migrating Record Data - implementation-specific

Some record data needs special migration depending on the particular implementation
We have separate scripts that we store in the configuration repo
Copy these scripts to the target v2 server
Copy the scripts to the application docker container
Run the scripts

Finishing Up: Still in the application Docker container, reindex Solr

Internationalization

Form/field translations

The form and field translations are migrated over as part of the configuration migration. Nothing really to do here… all good.

Application string translations

More of a manual process
Created new v2 project in Transifex to keep v2 translations separate from v1 translations
Currently working with the users to resolve a few outstanding issues and questions
ar or ar-IQ … ku or ku-IQ

Testing...

- Test with users of each role/type/locale

- For each,

Compare dashboard totals with v1
Compare case list totals with v1
Compare incident list totals with v1
Compare tracing request list totals with v1
Compare functionality / permissions
Compare translations

- Issues are logged in an issues spreadsheet

- Developers research the issue and provide an explanation or create a ticket to fix

Application issue?
Migration script issue?
Configuration issue?
Migration process issue? (Maybe developer forgot to clear out old test data before importing new migrated data)
Known application behavior change from v1 to v2?
Other?
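
The total comparisons listed above lend themselves to a small helper. This is a sketch assuming the totals have already been collected into hashes by hand or by some other script; the category names are illustrative.

```ruby
# Compare per-category totals from v1 and v2 and list the mismatches.
# A category missing on one side is treated as a total of 0.
def compare_totals(v1_totals, v2_totals)
  (v1_totals.keys | v2_totals.keys).sort.each_with_object([]) do |key, diffs|
    v1 = v1_totals.fetch(key, 0)
    v2 = v2_totals.fetch(key, 0)
    diffs << { "item" => key, "v1" => v1, "v2" => v2 } unless v1 == v2
  end
end

v1 = { "cases" => 120, "incidents" => 45, "tracing_requests" => 7 }
v2 = { "cases" => 120, "incidents" => 44, "tracing_requests" => 7 }
compare_totals(v1, v2).each do |diff|
  puts "#{diff['item']}: v1=#{diff['v1']} v2=#{diff['v2']}"
end
```

Each mismatch row maps naturally onto one line of the issues spreadsheet.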

Implementations currently being migrated

Jordan TSFV (Tracking System for Family Violence)
Iraq - CP
Iraq - GBV

Takeaways

Data is messy
Have a well-documented, repeatable migration process
Have good READMEs
Our contract partners have given feedback which I’ve incorporated into the READMEs
Script as much as you can
Keep implementation-specific scripts separate from the main generic migration scripts
Because data is messy, you can’t script everything. You can continue to massage and tweak the scripts as you find new things, but some issues may have to be handled manually
You will run into new issues with each new migration. Count on it!
