We partnered with individuals and businesses to come up with innovative ways to solve challenges we face as a central bank. Check out the results.
Innovative data cleaning
The Bank of Canada uses securities-related data from various external sources to monitor financial markets and perform associated research. To make this data usable, Bank staff must spend vast amounts of time manually “cleaning”—organizing and preparing—the information they receive.
The goals of this challenge were to discover innovative ways to clean data and to surpass current limitations by finding an automated solution.
The Bank partnered with Bronson Consulting Group to find a solution that would:
- reduce duplication of information
- automate the cleaning of the data
- improve the accuracy and robustness of results currently achieved in-house
“Bronson was excited to work with the Bank on this challenge. The challenge of data cleaning is one shared by many organizations, and we were confident we could help not only with the technology but through our approach to data readability and organization of large datasets.”
Phil Cormier, CA, CPA, Senior Consultant, Bronson Consulting Group
Led by Bronson’s Phil Cormier and supported by its president, Martin McGarry, Bronson staff began by reviewing the data provided by the Bank’s Financial Markets Department. The data involved a specific use case that required matching organizational names across three different datasets. Using the Alteryx Fuzzy matching tool as the matching engine, Bronson staff created workflows through Alteryx Designer to analyze, clean and standardize the data—which made linking the common fields across datasets easier.
After Bank and Bronson staff reviewed and discussed the initial results, we collectively focused on generating outputs that could be used to enhance the accuracy of record matching. A second review showed the potential for a scalable and robust solution that could eventually automate the Bank’s data-cleaning methods.
Once processed, the data was significantly cleaned and standardized, facilitating subsequent matching across the different datasets provided by the Bank.
Given that the Alteryx Fuzzy matching tool uses the same algorithm as the Bank’s internal method (that is, Jaro-Winkler distance), Bronson expected their results to be similar to the Bank’s when using the same data. Bronson staff not only generated comparable results but also showed that Alteryx Designer is a viable, robust and automatable solution. Using this tool could help the Bank reduce the number of hours that staff spend sorting through datasets to identify duplicated or misidentified information. Bronson provided a road map for the Bank to both solve its securities data-cleaning problem and automate that cleaning on an ongoing basis.
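To make the matching step concrete, here is a minimal sketch of Jaro-Winkler similarity in plain Python. It is not the Alteryx implementation or the Bank's internal code. The `normalize` helper, the function names and the default parameters (prefix scale 0.1, prefix length capped at 4, the values commonly used for this measure) are assumptions for illustration. The Jaro-Winkler distance is one minus the similarity computed here.

```python
import re


def normalize(name: str) -> str:
    """Hypothetical cleaning step: lowercase, drop punctuation, collapse spaces."""
    name = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(name.split())


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


# Made-up organization names, for illustration only.
score = jaro_winkler(normalize("Acme Holdings Inc."), normalize("ACME Holding, Inc"))
```

In practice, pairs scoring above a chosen threshold would be treated as candidate matches and reviewed. The threshold is a tuning decision that the source does not specify.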
Payment fraud detection
One of our core functions is to provide funds-management services to the Government of Canada, ourselves and other clients. This involves the secure inflow and outflow of payments. With this challenge, we wanted to see whether artificial intelligence and machine learning could:
- enhance our existing processes
- learn from historical transactions
- build models of normal transactional behaviour
- monitor transactions and identify atypical activity
In addition to the original scope—and to demonstrate preliminary capabilities in supporting payment screening—we also reviewed free format text within our payment activity to identify and match names, entities, countries and key terms against high-risk and sanctions lists.
Over a six-month period, we collaborated with MindBridge, an Ottawa-based artificial-intelligence firm. Our first step was to host a workshop to establish a baseline of typical payment activity across our payment systems—based on clients, dollar amounts, payment channels and the associated risk of financial crimes. This granular view of typical payment activity allowed MindBridge to glean potential applications for artificial intelligence and machine learning.
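As an illustration of what a baseline of typical payment activity can look like in code, the sketch below builds a per-client baseline from historical amounts and flags payments that fall far outside it. It is not MindBridge's method, which this page does not describe. The function names, the data layout and the use of a median-based (modified z-score) rule with a cutoff of 3.5 are assumptions chosen for simplicity.

```python
from collections import defaultdict
from statistics import median


def build_baseline(history):
    """Map each client to (median amount, median absolute deviation).

    `history` is an iterable of (client, amount) pairs (hypothetical layout).
    """
    by_client = defaultdict(list)
    for client, amount in history:
        by_client[client].append(amount)
    baseline = {}
    for client, amounts in by_client.items():
        med = median(amounts)
        mad = median([abs(a - med) for a in amounts])
        baseline[client] = (med, mad)
    return baseline


def is_atypical(baseline, client, amount, threshold=3.5):
    """Flag a payment whose amount is far from the client's usual range."""
    if client not in baseline:
        return True  # assumption: no history means the payment needs review
    med, mad = baseline[client]
    if mad == 0:
        return amount != med
    # Modified z-score: 0.6745 rescales the MAD to be comparable
    # to a standard deviation under a normal distribution.
    robust_z = 0.6745 * abs(amount - med) / mad
    return robust_z > threshold


# Made-up history, for illustration only.
history = [("A", 100), ("A", 110), ("A", 90), ("A", 105), ("A", 95)]
baseline = build_baseline(history)
flag_small = is_atypical(baseline, "A", 102)
flag_large = is_atypical(baseline, "A", 5000)
```

A real baseline would also account for the other dimensions named above, such as payment channel and the associated risk of financial crimes. Amount alone is used here only to keep the example short.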
“Partnerships like this are critical in advancing innovations in the payments sector and boosting Canada’s strength in machine learning and artificial intelligence.”
Nima Anvari, Data Scientist, MindBridge
With the goal of using its artificial-intelligence and machine-learning capabilities to assess our payment activity for abnormal and atypical transaction patterns, MindBridge set out to test and tune its algorithms. To this end, we compiled seven months of payment data, which we reformatted, parsed and masked. The data were then exported to MindBridge’s analytics environment.
Following several review and revision cycles, MindBridge presented its results and supporting analysis.
MindBridge’s system was able to identify known abnormalities as well as previously unknown and undetected transactional attributes and patterns, giving us a new perspective on our clients’ activities.
For MindBridge, the experiment proved that it could identify atypical payments using artificial intelligence and machine learning. As this project was experimental in nature, additional and substantive exploration and investigation would be needed to implement an artificial intelligence and machine-learning system as a control to support monitoring of financial crimes.
This experiment supported our ongoing efforts to understand the potential application of artificial intelligence and machine learning to enhance our work. We learned that to become a viable system for payments monitoring, an end-state solution must include the ability to integrate with our payment processes and workflows. Additionally, it would need to support real-time identification and blocking of atypical payment activity.
Categorizing sensitive data
Protecting our data is paramount. Our staff must therefore categorize their emails, documents and other corporate records properly. This can be tedious, and the manual process is prone to error. We wanted to see whether machine learning and artificial intelligence could help employees categorize information.
The challenge was to develop an algorithm that could differentiate sensitive documents from non-sensitive documents.
We partnered with PigeonLine, an artificial-intelligence firm that specializes in developing accessible, private and interactive technological solutions to everyday problems.
Several important questions guided our work:
- How can we build an accurate algorithm from labelled test data?
- Will users understand why an algorithm made certain choices?
- Can subject-matter experts train the algorithm?
- What if trainers disagree?
- Can we train the algorithm to capture more complex information?
The experiment was planned in two stages. In stage one, we built a machine-learning model from sample data. We collected 1,000 sample documents with different formats and quickly sorted them into two categories—sensitive and non-sensitive.
PigeonLine used a subset of the documents to train a machine-learning model. A Multinomial Naive Bayes model was selected because it offered some explanation of its results. This was useful in stage two. PigeonLine used the remaining documents to test the accuracy of the model.
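The sketch below shows, under simplifying assumptions, how a Multinomial Naive Bayes classifier works and why it can offer some explanation of its results: each word contributes its own log-likelihood to the score, so the words that pushed a document toward one category can be listed. This is not PigeonLine's code. The whitespace tokenizer, the Laplace smoothing value, the function names and the four toy documents are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Simplistic tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def train(docs, labels, alpha=1.0):
    """Fit word log-likelihoods per class with Laplace smoothing `alpha`."""
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = sorted({w for c in classes for w in word_counts[c]})
    log_prior = {c: math.log(doc_counts[c] / len(docs)) for c in classes}
    log_like = {}
    for c in classes:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return {"classes": classes, "log_prior": log_prior, "log_like": log_like}


def predict(model, text):
    """Return the class with the highest log-probability score."""
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for w in tokenize(text):
            if w in model["log_like"][c]:  # words unseen in training are skipped
                score += model["log_like"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


def explain(model, text, positive="sensitive", negative="non-sensitive"):
    """Per-word log-likelihood ratio; values above 0 favour `positive`."""
    ll = model["log_like"]
    return {
        w: ll[positive][w] - ll[negative][w]
        for w in set(tokenize(text))
        if w in ll[positive]
    }


# Toy training data, invented for this example.
docs = [
    "confidential salary report",
    "confidential merger plan",
    "lunch menu friday",
    "office party friday",
]
labels = ["sensitive", "sensitive", "non-sensitive", "non-sensitive"]
model = train(docs, labels)
label = predict(model, "confidential plan")
reasons = explain(model, "confidential plan")
```

The per-word values returned by `explain` are the kind of output a reviewer could inspect and correct, which is one way to picture the expert review described for stage two.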
In stage two, one of our experts was given a user interface to inspect the machine-learning model’s results, correct mistakes and identify additional explanations, such as context based on words and phrases in the documents. Our expert’s modifications were used to create a second machine-learning model, and accuracy was measured again.
This project demonstrated that a machine-learning model could be used to automatically categorize sensitive data, achieving 80 percent accuracy on our limited dataset. We were able to increase the accuracy to 83 percent using corrections from a subject-matter expert. Machine-learning solutions improve iteratively, and this one would need further training by multiple experts. The approach promises increased accuracy with further research.
More work is needed to understand how expert knowledge and document-level explanations improve the model. If successful, the algorithm could automatically categorize data for all employees, saving time and reducing error. It could have an important role in cyber security and safeguarding the Bank’s sensitive data.
Household spending measures
Household spending plays a key role in Canada’s economy. Policy measures that affect Canadian households, including interest rate decisions, are regularly introduced at both the national and the regional level. It’s important that we, as a central bank, understand how these measures influence growth in household spending.
Understanding fluctuations in economic activity is central to our day-to-day operations. The main problem we face is the slow speed at which traditional economic data are published.
“The PIVOT project gave me hands-on experience of the highs and lows of academic research through the meticulous development of the data and models of household consumption.”
John Baker, PIVOT participant and PhD candidate, University of Waterloo
With this challenge, we wanted to explore ways to assess, in real time, how Canadians respond to Bank of Canada policy decisions—for example, what’s the impact of our interest rate decisions during a period of elevated debt levels? Analyzing these high-frequency data would make Canadian economic analysis more timely, support our data-driven policy and ultimately help us fulfill our mandate.
We partnered with John Baker, a PhD candidate at the University of Waterloo. A first step was to evaluate approaches that would enable us to gather, process and analyze high-frequency data as they become available. We wondered if we could achieve our desired outcome by analyzing search queries—the string of characters a user types into a search engine such as Google.
Over the course of the project, we discovered how to leverage search-query data to assess the impact of policy measures on household spending in real time. Mr. Baker conducted the experiment at the University of Waterloo, supervised by faculty in the Department of Economics. Experts from the Bank guided him on technical aspects and the goals of the project.
“Innovation and risk taking are best achieved when both partners engage in open and honest knowledge sharing.”
Daniel de Munnik, Director, Real Economic Activity, Canadian Economic Analysis
The event study showcased search-query results following the Bank’s fixed announcement dates. These data provided us with a unique look at how Canadians respond to a variety of policy announcements and revealed a potential new way to analyze the impact of central bank communications.
The main empirical experiment showed possible benefits of incorporating alternative data sources into conventional forecasting methods. In this case, our models of household spending improved when we combined search-query data with traditional data sources.
For Mr. Baker, the experiment supported the research component of his doctorate and gave him experience analyzing data and developing models of household consumption. Also, his work with central bank economists allowed him to see the challenges they face in providing timely policy advice to the Bank’s Governing Council.
Cyber security analytics
In a continuously evolving technology and cyber landscape, the Bank of Canada’s Cyber Defence Centre must quickly and accurately identify relevant patterns based on diverse data feeds, like context data and activity data—for example, firewall and proxy logs, operating system logs, and intrusion-prevention system alerts.
With this challenge, we wanted to experiment with new approaches to analytics methodologies—with a special interest in machine learning—to help detect potential cyber security threats to the Bank.
The goal was to enhance our analysis by coming up with innovative ways to detect malicious activities that may be missed using more traditional tools or current alerting platforms.
“The PIVOT program enables innovation and allows participants to leverage new ideas to overcome new challenges. In our case, the experiment taught us the importance of having a wide variety of data sources and ensuring the quality of the collected data used to train machine-learning models.”
Félix Trépanier, Security Architecture Research and Testing Analyst, Information Technology Services
We took on this challenge with Ottawa-based cyber security firm C3SA Cyber Security Audit Corporation. The project had three phases:
- We integrated a platform capable of monitoring data from across the Bank’s information technology (IT) infrastructure. This platform uses machine learning to ingest and interpret high volumes of IT and security data in real time.
- We analyzed the data, covering several security themes.
- We developed models to detect anomalies in our network. During this phase, we also evaluated how well machine learning detected simulated or non-simulated cyber incidents. This enabled us to more effectively perform tests on our network.
Throughout the project, we tested more than 25 machine-learning models, some of which continue to monitor our infrastructure and alert us in real time to any issues.
“Be ambitious. Take the many opportunities provided through the PIVOT program to drive excellence in your project and outcomes.”
Dr. Jeff Schwartzentruber, C3SA Cyber Security Audit Corporation
This challenge unveiled the potential of using machine learning as a supplementary tool to detect anomalies in our infrastructure and mitigate the risks of cyber incidents.
We showed how analyzing large amounts of data can significantly enhance our traditional, alert-driven approach to cyber security monitoring. By leveraging these data effectively, we’re able to increase the resilience and security of our IT infrastructure.
French language evaluation
Many employees at the Bank are working at improving or maintaining their second official language to meet proficiency requirements. With this challenge, we wanted to test whether artificial intelligence could help these language learners assess how they are doing between official evaluations. If successful, a similar tool could also help job candidates informally gauge their language skills before applying for the roles that interest them.
We envisioned a tool that could return the same language-proficiency score as an evaluator. But we knew it wouldn’t be easy, since accurate language evaluation requires nuanced judgment of multiple elements, such as structure, tone and pronunciation.
“With our objectives to modernize second-language services and to advance and promote bilingualism at the Bank, we thought that PIVOT could be a great opportunity for us to see if artificial intelligence and machine learning could be applied to support employees in their learning goals.”
Marie-Claude Decobert-Delcourt, French-Language Teacher, Human Resources
Over five months, we collaborated with Silver—an Edmonton-based technology company that specializes in creating voice products and services—to build an artificial-intelligence tool.
We began by gathering data, examining the existing process for evaluations and developing the initial prototype. To build a proof of concept that would mimic the interaction, the Silver team observed evaluations to fully experience the process. They worked closely with the human resources team to better understand the perspectives of both the evaluator and the participant.
We then “trained” the first proof of concept by using the voices of 20 volunteers, who also gave us feedback on their experiences with the tool.
Finally, in May 2019, we launched the revamped version, called SilverFR, and encouraged employees from across the Bank to try the prototype. Roughly 100 employees of different levels of language proficiency volunteered.
“PIVOT is about experimentation. The Bank is offering you an environment to play in, so don’t be afraid to push your ideas and boundaries.
“With this tool, the Bank now has a prototype that, with further work and development, could be used by employees who are actively learning French as a second language. And Silver now has a product that we can build on further.”
Shawn Kanungo, Founder and CEO, Silver
Results presented by Silver in June 2019 revealed that, for most skill levels, the tool could broadly determine an individual’s French proficiency more than 60 percent of the time. We estimate that an expanded sample size would improve on this performance.
Silver came into the challenge with no experience in language evaluation and faced the task of creating a tool that did not yet exist in this field. The company will be able to use the knowledge and skills gained from collaborating with the Bank in its work with other clients and consumers.
For the Bank, success was based on innovation and efficiency. We learned about the potential of artificial intelligence and the value of innovative technologies in meeting challenges that don’t have a readily available commercial solution. The project will be a strong foundation for future experiments at the Bank.