How we migrated Project Segfault’s matrix homeserver to matrix-docker-ansible-deploy

Mon, 17 April 2023

Yesterday, we completed Project Segfault’s migration from matrix.org’s official docker image for synapse to matrix-docker-ansible-deploy.

We migrated because of how much of a pain it is to set up workers, especially with docker. The docs aren’t great about it either.

For these reasons, we turned to matrix-docker-ansible-deploy.

The first issue we encountered was how spread out the docs were, although they were very precise and well-explained.

Once we cloned the repo, we first had to set up the inventory hosts file.

Since we cloned the repo onto the docker VM itself, we had a weird solution for this.

[matrix_servers]
matrix.projectsegfau.lt ansible_host=localhost ansible_ssh_user=root

After this, we had to add the pubkey of the VM to its own authorized_keys. Wacky :P

With that sorted, we had to start configuring.

Firstly, we had to prevent it from installing docker.

This is important since (re)installing docker will break a lot, especially for our pre-existing services.

matrix_playbook_docker_installation_enabled: false

After that, we had to add our old secret keys back to the config file so that it wouldn’t break federation:

matrix_synapse_macaroon_secret_key: "xxx"
matrix_synapse_registration_shared_secret: "xxx"
matrix_synapse_form_secret: "xxx"

The signing key had to be re-added as well, but only later, after the setup was complete.

After that, we turned to the thing we migrated for: synapse workers.

Since the generic and federation_sender workers have to process a lot of data, we made 4 of each (totally didn’t copy the number from envs.net :P).

matrix_synapse_workers_enabled: true
matrix_synapse_workers_preset: one-of-each
matrix_synapse_workers_federation_sender_count: 4
matrix_synapse_workers_generic_worker_count: 4

Another important thing we had to take into consideration was postgres. We ran postgres on a separate VM and connected to the database on it.

matrix_synapse_database_host: "192.168.5.4"
matrix_synapse_database_user: "synapse"
matrix_synapse_database_password: "xxx"
matrix_synapse_database_database: "synapse"
devture_postgres_enabled: false

After the database, we had to set up registration/login stuff.

A weird thing I noticed about matrix-docker-ansible-deploy’s email configuration is that it runs its own mail relay on top of our mail server and credentials.

matrix_mailer_sender_address: "matrix@projectsegfau.lt"
matrix_mailer_relay_use: true
matrix_mailer_relay_host_name: "mail.projectsegfau.lt"
matrix_mailer_relay_host_port: 587
matrix_mailer_relay_auth: true
matrix_mailer_relay_auth_username: "matrix@projectsegfau.lt"
matrix_mailer_relay_auth_password: "xxx"
matrix_synapse_registrations_require_3pid: [ email ]
matrix_synapse_enable_registration: true
matrix_synapse_configuration_extension_yaml: |
  oidc_providers:
    - idp_id: authentik
      idp_name: "authentik"
      idp_icon: "mxc://envs.net/429bd4b307d32b919a94823f03acc7c24a7da61f"
      discover: true
      issuer: "https://auth.p.projectsegfau.lt/application/o/matrix/"
      client_id: "xxx"
      client_secret: "xxx"
      scopes:
        - "openid"
        - "profile"
        - "email"
      user_mapping_provider:
        config:
          localpart_template: "{% raw %}{{ user.preferred_username }}{% endraw %}"
          display_name_template: "{% raw %}{{ user.name }}{% endraw %}"
          email_template: "{% raw %}{{ user.email }}{% endraw %}"

Beyond this, we also had to port the smaller settings from our old homeserver.yaml to the ansible format.

Since most of these weren’t documented very well, we had to make heavy use of the defaults file.

matrix_synapse_auto_join_rooms: [ '#project-segfault:projectsegfau.lt', '#support:projectsegfau.lt', '#general:projectsegfau.lt', '#announcements:projectsegfau.lt' ]
matrix_synapse_max_upload_size_mb: 700
matrix_synapse_allow_public_rooms_without_auth: true
matrix_synapse_allow_public_rooms_over_federation: true
matrix_synapse_email_client_base_url: "https://matrix.to"
matrix_synapse_email_invite_client_location: "https://chat.projectsegfau.lt"
matrix_synapse_turn_uris: ["turn:turn.projectsegfau.lt?transport=udp", "turn:turn.projectsegfau.lt?transport=tcp"]
matrix_synapse_turn_shared_secret: "xxx"
matrix_synapse_turn_allow_guests: true
matrix_coturn_enabled: false
matrix_client_element_enabled: false

At this point, we realized that we needed to do a lot of weirder stuff to get it working reverse-proxied behind our main caddy instance.

We reverse-proxied the traefik instance behind our caddy instance, following the instructions in the documentation:

# Ensure that public urls use https
matrix_playbook_ssl_enabled: true

# Disable the web-secure (port 443) endpoint, which also disables SSL certificate retrieval
devture_traefik_config_entrypoint_web_secure_enabled: false

# If your reverse-proxy runs on another machine, consider using `0.0.0.0:81`, just `81` or `SOME_IP_ADDRESS_OF_THIS_MACHINE:81`
devture_traefik_container_web_host_bind_port: '0.0.0.0:81'

# We bind to `127.0.0.1` by default (see above), so trusting `X-Forwarded-*` headers from
# a reverse-proxy running on the local machine is safe enough.
devture_traefik_config_entrypoint_web_forwardedHeaders_insecure: true
devture_traefik_additional_entrypoints_auto:
  - name: matrix-federation
    port: 8449
    host_bind_port: '0.0.0.0:8449'
    config: {}

After all the configuration was done, we had to run it :P.

Firstly, we had to install ansible and just (the command runner), then run just roles to initialize all the ansible stuff.

At this point, we shut down our old matrix instance in order to not cause any issues.

Then, we ran ansible-playbook -i inventory/hosts setup.yml --tags=install-all to install all the files but not start the services.

Now came the most time-consuming part: importing the old media repo, which was over 85 gigabytes in size.

ansible-playbook -i inventory/hosts setup.yml --extra-vars='server_path_media_store=/opt/docker/mtrx/files/media_store' --tags=import-synapse-media-store

This took almost 30 minutes, which was the majority of the downtime we had.
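For anyone sizing up their own downtime, those two numbers give a rough average import rate. The sketch below is just arithmetic on the rounded figures from this post (85 GB, 30 minutes); the estimate_minutes helper and the 200 GB example are made up for illustration, and real speed will depend on disks and file counts.

```python
# Figures from this post: "over 85 gigabytes" in "almost 30 minutes".
# Both are rounded, so the result is only a ballpark.
media_store_gb = 85
import_minutes = 30

# Average rate in megabytes per second (decimal units: 1 GB = 1000 MB).
rate_mb_per_s = media_store_gb * 1000 / (import_minutes * 60)


def estimate_minutes(size_gb: float, rate: float = rate_mb_per_s) -> float:
    """Estimate the import time in minutes for a media store of size_gb.

    Hypothetical helper: it assumes the same average rate we saw.
    """
    return size_gb * 1000 / rate / 60


print(f"average rate: {rate_mb_per_s:.1f} MB/s")
print(f"a 200 GB store would take about {estimate_minutes(200):.0f} minutes")
```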

After this was done, we were able to start the server: ansible-playbook -i inventory/hosts setup.yml --tags=start.
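Put together, the three playbook runs above could be scripted roughly like this. It is only a sketch: migration_commands and run_migration are hypothetical helper names, the commands are the ones quoted in this post, and it defaults to a dry run that just prints them.

```python
import subprocess

INVENTORY = "inventory/hosts"
PLAYBOOK = "setup.yml"


def migration_commands(media_store_path: str) -> list[list[str]]:
    """Return the three playbook invocations from this post, in order."""
    base = ["ansible-playbook", "-i", INVENTORY, PLAYBOOK]
    return [
        # 1. Install all the files without starting the services.
        base + ["--tags=install-all"],
        # 2. Import the old synapse media store.
        base + [
            f"--extra-vars=server_path_media_store={media_store_path}",
            "--tags=import-synapse-media-store",
        ],
        # 3. Start the services.
        base + ["--tags=start"],
    ]


def run_migration(media_store_path: str, dry_run: bool = True) -> None:
    """Print each command, and run it unless dry_run is set."""
    for argv in migration_commands(media_store_path):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_migration("/opt/docker/mtrx/files/media_store")
```

Passing dry_run=False would actually run the playbook, so the old instance should already be shut down at that point.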

The nginx configuration recommended in the documentation for our reverse-proxy setup was pretty self-explanatory and easy to convert, except that until now our matrix instance had used normal delegation and did not make use of port 8448.

Due to this, we had to waste a lot of time trying to figure out which routes went to which ports. I wish the documentation explained this better.

In the end, this was the caddy configuration we came up with:

matrix.projectsegfau.lt {
    reverse_proxy /_matrix/* 192.168.5.2:8449
    reverse_proxy /_matrix/client/* 192.168.5.2:81
    reverse_proxy /_synapse/* 192.168.5.2:81
}
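A detail worth knowing here is that caddy sorts directives of the same name by the specificity of their path matcher, so the longer /_matrix/client/* matcher is tried before /_matrix/* even though it is written second. The following is a simplified model of that ordering in Python, not caddy's actual implementation; upstream_for is a hypothetical helper and only trailing-* prefix matchers are handled.

```python
from typing import Optional

# Path matchers and upstreams copied from the Caddyfile above.
ROUTES = [
    ("/_matrix/*", "192.168.5.2:8449"),
    ("/_matrix/client/*", "192.168.5.2:81"),
    ("/_synapse/*", "192.168.5.2:81"),
]


def upstream_for(path: str) -> Optional[str]:
    """Return the upstream a request path would reach, or None if no route matches.

    Simplified model: longest matcher first, trailing "*" treated as a prefix match.
    """
    by_specificity = sorted(ROUTES, key=lambda route: len(route[0]), reverse=True)
    for pattern, upstream in by_specificity:
        if path.startswith(pattern.rstrip("*")):
            return upstream
    return None


for example in (
    "/_matrix/client/versions",
    "/_matrix/federation/v1/version",
    "/_matrix/media/v3/config",
    "/_synapse/admin/v1/server_version",
    "/.well-known/matrix/server",
):
    print(example, "->", upstream_for(example))
```

Under this model, client API paths go to port 81, everything else under /_matrix/ (federation, keys, media) goes to port 8449, and paths such as /.well-known/matrix/server match none of the three routes.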

This configuration works right now, though we are still not completely sure if other routes need to go somewhere else.
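A quick way to probe this from the outside is to request a couple of unauthenticated endpoints from the Matrix specification through the public hostname. This is a minimal sketch using only Python's standard library; check_urls and probe are hypothetical helper names, and a successful response only shows that these two routes reach synapse, not that every route is correct.

```python
import json
import urllib.request

BASE_URL = "https://matrix.projectsegfau.lt"

# Endpoints defined by the Matrix specification that answer plain GET requests.
CHECK_PATHS = [
    "/_matrix/client/versions",        # client-server API
    "/_matrix/federation/v1/version",  # server-server (federation) API
]


def check_urls(base_url: str = BASE_URL) -> list[str]:
    """Build the full URLs to probe for a given homeserver base URL."""
    return [base_url.rstrip("/") + path for path in CHECK_PATHS]


def probe(url: str, timeout: float = 5.0) -> dict:
    """Fetch a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    for url in check_urls():
        try:
            print(url, "->", probe(url))
        except (OSError, ValueError) as exc:
            # Network/HTTP errors are OSError subclasses; bad JSON is ValueError.
            print(url, "-> failed:", exc)
```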

I do have some gripes with matrix-docker-ansible-deploy though, such as the ages it takes for restarts (--tags=setup-build and then --tags=restart for those wondering) and the lack of documentation on the recommended upstream delegation configuration.

In the end, matrix-docker-ansible-deploy simplified our config a lot and relieved us of much of the maintenance burden we would have had if we had configured everything manually, and I am thankful for that.