Back to the Data Mine

The census is a waste of resources that produces unreliable information. Why put Canadians through the process?

Graphic by Paul Kim

Your census arrived in the mail, and having seen the commercials, you were excited to fill it out, do your civic duty, and contribute to the understanding of our ever-evolving communities. You might even have been one of the keen folks so consumed by the notion of filling out a government-mandated survey that you helped crash the Statistics Canada website.

But as someone with an MSc in Applied Statistics who works with demographic and behavioural data all day every day, I struggle to understand the purpose and efficiency of the census. Statscan itself has established that the income data collected on the census is unreliable. Statscan has had access to Revenue Canada data since 1985, and only recently decided to start using that data instead of asking a question about income on the census. I imagine an investigation revealed that people have a tendency to underreport their income on the census: it’s much easier to remember your salary than your salary plus benefits, bonuses, etc. So this year, for the first time in census history, Statscan will be using your tax information from Revenue Canada. They “anticipate that it’s going to be the highest quality income data that we’ve ever had on the census,” mostly because it’s not actually from the census at all. It only took thirty years to figure that one out.

If you’re following this logic, before the 2016 census, not only was the government asking you for information it already had, but it seems to have spent time and resources (tax dollars) checking whether the information you’re legally required to provide matched the information it already knew to be true. Much of what is asked on the census is information already housed in government records: data you’re already required to provide to the government in the form of birth certificates, death certificates, marriage licenses, immigration records, property taxes, travel records attached to your passport, and many other documents.

Given that the government already has access to much of the information, consider the amount of time Canadians are spending to fill out the census. It took me about ten minutes to complete my short-form census, once the website was back up and running, of course. I’m likely on the lower end of the time scale, as there are only two people living under our roof and we have a pretty simple situation. So let’s estimate an average of fifteen minutes for the short form, which is sent to three quarters of Canadian households. The remaining quarter receive the long-form version.

Completing the short-form census at a national level would take approximately 2.5 million hours of work, which equates to one person working forty-hour weeks for 1,202 years. This doesn’t include the hours involved in preparing the questions, vetting the questions with every Tom, Dick, and Sally who wants to be involved, and setting up the ill-equipped website. All this effort, only to end up with a set of questions that looks one heck of a lot like the 2011 National Household Survey. As was published in the Canada Gazette, “The questions for the 2016 Census of Population are the same as the 2011 Census of Population. Minor modifications have been made to the instructions.” No questions have been changed, except that the unneeded income questions have been removed and the question on religion doesn’t appear, as it’s asked only once every ten years.
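For anyone who wants to check the back-of-envelope math, here is a rough sketch of the conversion from total hours to one person's working years. The fifty-two working weeks per year (no vacation, no holidays) is an assumption for illustration, and the function names are made up for this example; the 2.5 million hours and fifteen minutes per form are the estimates used above.

```python
HOURS_PER_WEEK = 40   # forty-hour work week, as in the estimate above
WEEKS_PER_YEAR = 52   # assumption: every week of the year is worked


def work_years(total_hours: float) -> float:
    """Convert a number of hours into years of full-time work for one person."""
    return total_hours / (HOURS_PER_WEEK * WEEKS_PER_YEAR)


def implied_responses(total_hours: float, minutes_per_response: float) -> float:
    """Number of completed forms that a given hour total corresponds to."""
    return total_hours * 60 / minutes_per_response


short_form_hours = 2_500_000  # approximate national total quoted above

print(round(work_years(short_form_hours)))      # 1202
print(implied_responses(short_form_hours, 15))  # 10000000.0
```

In other words, at fifteen minutes a form, 2.5 million hours corresponds to roughly ten million completed short forms.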

The physical and financial resources required for the census are also staggering. 11.4 million households received two sheets of paper (unless they requested a physical copy of the questionnaire), but 3.8 million households received the long version: thirty-six pages of English plus thirty-six pages of French census questions, an eight-page guide, a letter of introduction, and a return envelope. You then likely went online to complete the survey like a normal human of the twenty-first century and tossed all this paper in the trash. Facilitating your census submission are some 35,000 temporary employees hired specifically for the task. The overall estimate for conducting the 2016 census is $715 million.

Unfortunately, the total cost of the census will remain an estimate until two years after it has been completed. Statscan has outlined multiple release dates for the data collected, the earliest of which is February 2017, nine months after the forms are sent in. The latest release is scheduled for that November, which even at eighteen months will be the fastest turnaround in census history. What is being released is simple descriptive statistics: no analysis is being conducted and no conclusions are drawn. So why the delay? By the time all of the data from the 2016 census is released, it will already be a year and a half out of date.

Although I do believe that wasted resources, data reliability, and turnaround are pivotal to the census conversation, there is also an argument to be made about data privacy. In an age where a nineteen-year-old computer science student can gain access to private government records from the comfort of their home, it is worrying to know that so much personal information is being housed at all. You may be comforted to know (though I don’t find it particularly comforting myself) that everyone who works at Statscan must sign an Oath of Secrecy in compliance with the Statistics Act. The archaic-sounding vow equates to a confidentiality agreement, but instead of getting fired, the punishment for data security breaches can include fines and imprisonment. Maybe there has been no breach of the personal information housed at Statscan because stringent security practices are in place, or maybe it’s because no one outside of Statscan and government agencies sees any value in the data available.

What has been created in decades of costly census data acquisition is really just the Government of Canada’s most out-of-date and resource-intensive database of unreliable data. If the government could take it upon itself to restructure the way in which data is housed, processed, and shared, we likely would no longer have a need for a census at all.

When the tan envelope arrived in my mailbox, I was legally bound to go online and complete the survey. But in going through the motions of filling it out on behalf of my household, I felt no joy, no sense of civic pride, and no faith that my answers were serving any kind of greater purpose.

Jess Connolly
Jess Connolly works delivering data-driven marketing initiatives in Calgary, Alberta.