Database considerations

Database and schema design best practices and recommendations.

You’ll need to store two different types of information, and may want to choose two different places to store them because the needs differ significantly:

  1. General information about each vaccine provider location

  2. Data about appointment availability at each location

General information about each vaccine provider location

It’s critical that this data be easy to update by DOH staff, technical staff, and other staff (program managers, etc.) working on your vaccine provider. DOH probably already has a list like this, but it may not be structured in a way that works well for programmatic access or may be missing important data you want to make available to residents. We recommend Airtable for this, but tools as simple as Google Sheets or as complex as Salesforce, ArcGIS, or other database systems can work, too. Whatever solution you choose, focus on balancing ease of editing by your staff with ease of access for software.

For schema design, specific fields we recommend including are:

  • ID: The unique identifier of the provider site. Our experience has been that no other commonly used identifier is likely to exist and be unique for all locations in this dataset. Recently-stood up vaccination centers may not be individually listed in your state’s IIS, some clinics with a single VTRCKS Pin may have multiple COVID-specific locations on the ground, etc.

  • [identifiers from other systems]: You’ll want to make sure that you can reliably match entries from this database to data from a variety of sources, and addresses are not stable or unique enough to do this job. Identify other types of identifiers that are helpful for matching and include them. For example:

    • State IIS Identifier

    • VTRCKS Pin

    • Store Type and Number (for pharmacies like CVS)

  • Name

  • Address

  • City

  • County

  • State

  • Zip code

  • Longitude/Latitude: If you don’t already have this information, you may be able to work with your state GIS Office or use commercial geocoding services like LocationIQ, Geocodio, or ArcGIS’s Geocoding API. Because all geocoding solutions can make mistakes, you’ll want to store the results here so they can be manually corrected as needed.

  • Operating Organization: For example, “CVS,” “XYZ Hospital Group,” or “state system.” You might keep more details about these organizations in another table and use this field as a reference.

  • Booking phone number

  • Informational phone number

  • Booking URL

  • Informational URL

  • Restrictions/eligibility requirements: This might include things like “county residents only,” “veterans only,” “age 65+,” etc. If possible, consider building a standard vocabulary and storing these as a list or array, rather than as free text, so your front-end can treat certain restrictions specially by giving them icons or showing them in certain spots on-screen.

  • Description

  • Internal Notes: You’ll want to have some fields that allow your team to manage their work or keep non-public notes and that are not published with the rest of the data.

You don’t necessarily need to be able to fill in all these fields — your VAF is fundamentally an attempt to get people as much useful information as you reasonably can in a limited timeframe. It’s not a perfect reference.

Data about appointment availability at each location

Unlike the general anecdotal information about locations, this database does not need to be easy for humans to edit. Instead, it should prioritize quick, automatic updates and historical data (you might store historic information about appointments separately, but you definitely should store it).

If your state has few provider locations (e.g. less than 75) or updates data no more often than every 5 minutes, a simple JSON file in a version control system (e.g. GitHub, GitLab, BitBucket) is an excellent solution. Setting this up is quick, extremely low cost, comes with history built-in, and scales extremely well when many residents load the data. However, it works less well as the number of provider locations increases or updates grow more frequent. When that’s the case, you’ll want to transition from Git to a more typical database, such as Postgres, MongoDB, etc.

In most cases, this database will also contain a copy of the information from the database that lists vaccine provider locations. You can use the same database for both (as long as it’s easy to edit for non-technical staff) or you might copy the data from one to the other once a day.

Last updated