Postal Codes and research

Published on February 3, 2016

Given a recent thread on the Canadian Association of Geographers listserv, I thought I’d put up a post about doing research with Postal Codes. For background, I recently worked at Statistics Canada, where I was responsible for the redevelopment of the Postal Code Conversion File + (PCCF+), the licensed (free to academics) product for geocoding Canadian Postal Codes.

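There are a lot of issues with Postal Codes. (Note: a Postal Code is composed of a Forward Sortation Area (FSA, the first three characters) followed by a Local Delivery Unit (LDU, the last three characters), in an H0H 0H0 format. FSAs are public; LDUs are proprietary to Canada Post. Be warned!)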
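As a small illustration of the FSA + LDU structure, here is a minimal Python sketch that splits a Postal Code into its two parts. The function name and the regular expression are my own for illustration; the pattern only checks the letter-digit-letter digit-letter-digit shape and says nothing about whether a code is actually in use.

```python
import re

# Shape check only: letter-digit-letter, optional space, digit-letter-digit.
# This does NOT confirm that Canada Post has actually issued the code.
POSTAL_CODE_RE = re.compile(r"^([A-Z][0-9][A-Z])\s?([0-9][A-Z][0-9])$")


def split_postal_code(raw):
    """Return (FSA, LDU) for a well-formed Postal Code, or None otherwise."""
    match = POSTAL_CODE_RE.match(raw.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_postal_code("h0h 0h0"))  # ('H0H', '0H0')
print(split_postal_code("12345"))    # None
```

Normalizing case and spacing before any linkage is worthwhile, since the same Postal Code often arrives as "H0H0H0", "h0h 0h0" and so on in administrative data.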

First, Postal Codes are technically not spatial. They are developed for the efficient sorting and distribution of mail, full stop. A 6-character Postal Code may refer to a building (a point), a portion of a route walk (a line), or a delivery area (a polygon). Some cover very small geographic spaces (a single building), while others cover large areas (several Census Subdivisions). Postal Codes (though not FSAs) can overlap, and residential Postal Codes differ from commercial ones.

Postal Codes (denoted by an x) in Urban Areas

Even FSAs will not correspond with other administrative boundaries. This is especially important outside urban areas, where we may not even know which municipality a particular FSA is associated with.

Forward Sortation Area (FSA) Boundaries compared to Census Subdivision (CSD) boundaries

Second, we need to treat Postal Codes carefully in order to reduce non-random spatial bias. In most cases we geocode Postal Codes either to get a representative latitude / longitude point, or to link to administrative identifiers and supplementary variables. In urban areas (about 80% of the country's population) this is not too much of an issue. A Postal Code will map to one primary Dissemination Block (a single link) and have only a few potential latitude / longitude possibilities, all of which are geographically close. However, in suburban (especially) and rural areas, Postal Codes can map to multiple administrative boundaries and have potential representative coordinates that are kilometres apart. These need to be treated with care. Most products (DMTI included) will provide a “best-fit” based on the location with the majority population. However, because every record with that Postal Code is then assigned to the majority location, the other locations it serves receive no one and drop out of the analysis completely. In fact, thousands of Census Subdivisions would be excluded from analyses if only a single link were used.

CSD where no individuals would be assigned using a PCCF single-linkage approach.
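One way to find the Postal Codes that need care is to measure how far apart their candidate coordinates are. The sketch below is a minimal Python example that assumes you already have a list of candidate (latitude, longitude) points for a Postal Code. The function names are hypothetical, and it uses the haversine great-circle formula with a spherical Earth, which is an approximation.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def max_spread_km(points):
    """Largest pairwise distance among candidate points (0.0 if fewer than two)."""
    return max(
        (haversine_km(p, q) for p, q in combinations(points, 2)),
        default=0.0,
    )


# Invented coordinates, for illustration only.
urban_candidates = [(45.4200, -75.7000), (45.4203, -75.7004)]
rural_candidates = [(45.0, -75.0), (45.2, -75.3)]
print(max_spread_km(urban_candidates) < 1.0)   # True: candidates are close
print(max_spread_km(rural_candidates) > 10.0)  # True: candidates are far apart
```

Any cut-off for "too far apart" is a judgment call that depends on the study; Postal Codes above it are the ones where a single representative point is most likely to mislead.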

Third, Canada Post doesn’t particularly care that a wide variety of stakeholders use Postal Codes for a wide variety of purposes (local government planning, health human resources, provincial elections planning, epidemiological studies, cancer studies, other academic research, etc.). This is not in their business plan, and unless it is mandated upon them, it won’t be.

Fourth, there are several tools out there for working with Postal Codes. These include the PCCF and PCCF+ from Statistics Canada. The PCCF is the best source for all Postal Code information in Canada, and with a lot of work you can use it in your analyses. The PCCF+ is a SAS program that conducts a population-weighted random assignment for Postal Codes where there are multiple matches. It also treats residential and institutional Postal Codes differently, links to many administrative codes, and includes other supplementary codes.
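To make the difference between a single link and a population-weighted random assignment concrete, here is a minimal Python sketch. It is not the PCCF+ code (which is SAS) and does not reproduce its rules. The lookup table, area identifiers, populations and function names are all invented for illustration.

```python
import random

# Hypothetical lookup: Postal Code -> candidate (area_id, population) pairs.
# Identifiers and populations are made up; real values would come from the PCCF.
CANDIDATES = {
    "A1A 1A1": [("area_1", 900), ("area_2", 90), ("area_3", 10)],
    "B2B 2B2": [("area_4", 250)],
}


def single_link(postal_code):
    """Always return the candidate area with the largest population."""
    return max(CANDIDATES[postal_code], key=lambda pair: pair[1])[0]


def weighted_link(postal_code, rng):
    """Return one candidate area, chosen with probability proportional to population."""
    areas, populations = zip(*CANDIDATES[postal_code])
    return rng.choices(areas, weights=populations, k=1)[0]


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
records = ["A1A 1A1"] * 1000  # 1000 people reporting the same Postal Code

areas_single = {single_link(pc) for pc in records}
areas_weighted = {weighted_link(pc, rng) for pc in records}

print(areas_single)         # {'area_1'}: the smaller areas never receive anyone
print(len(areas_weighted))  # more than one area receives records
```

With a single link, every record lands in the majority area, so the smaller areas are empty by construction. With weighted assignment, records are spread across the candidates roughly in proportion to population. Any individual record may still be placed in the wrong area, but the areas are no longer systematically excluded.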