Description: This is a bulk open-access dataset in JSON format containing the full text of Supreme Court of Canada decisions. The data-processing pipeline and code snippets for loading the data are available in a repository on the Refugee Law Lab GitHub.
Data: https://github.com/Refugee-Law-Lab/scc_bulk_data/tree/master/DATA/YEARLY
Code Repository: https://github.com/Refugee-Law-Lab/scc_bulk_data
Current Coverage: 1877 – 2022
Number of Decisions: ~15,500
Languages: English & French
Format: JSON (yearly files)
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Citation: Sean Rehaag, “Supreme Court of Canada Bulk Decisions Dataset” (2023), online: Refugee Law Laboratory https://refugeelab.ca/bulk-data/scc
Programmatic Access in Python:
    import pandas as pd
    import requests

    # Set variables
    start_year = 1877  # First year of data sought (1877 or later)
    end_year = 2022    # Last year of data sought (2022 or earlier)
    language = None    # Language of cases sought ('en', 'fr', or None for both)

    # Load data, one JSON file per year
    base_url = 'https://raw.githubusercontent.com/Refugee-Law-Lab/scc_bulk_data/master/DATA/YEARLY/'
    results = []
    for year in range(start_year, end_year + 1):
        url = base_url + f'{year}.json'
        response = requests.get(url)
        response.raise_for_status()  # Stop early if a yearly file cannot be downloaded
        results.extend(response.json())

    # Convert to dataframe
    df = pd.DataFrame(results)

    # Filter by language if applicable
    if language:
        df = df[df['language'] == language]
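If pandas and requests are not available, or if a download should continue when one yearly file cannot be fetched, the same loading steps can be sketched with the standard library only. This is a minimal sketch rather than the repository's own code: the helper names `yearly_urls`, `fetch_json`, and `load_decisions` are hypothetical, and it assumes (as the snippet above implies) that each yearly file is a JSON list of records with a `language` field.

```python
import json
from urllib.request import urlopen

BASE_URL = 'https://raw.githubusercontent.com/Refugee-Law-Lab/scc_bulk_data/master/DATA/YEARLY/'


def yearly_urls(start_year, end_year, base_url=BASE_URL):
    """Return the URL of each yearly JSON file, both endpoints included."""
    return [f'{base_url}{year}.json' for year in range(start_year, end_year + 1)]


def fetch_json(url):
    """Download one yearly file and parse it as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def load_decisions(start_year, end_year, language=None, fetch=fetch_json):
    """Collect records for a range of years.

    Returns (records, skipped), where skipped lists the (url, error) pairs
    for yearly files that could not be downloaded or parsed.
    """
    records, skipped = [], []
    for url in yearly_urls(start_year, end_year):
        try:
            records.extend(fetch(url))
        except (OSError, ValueError) as exc:  # network/HTTP errors, invalid JSON
            skipped.append((url, str(exc)))
    if language:
        # Assumes each record is a dict with a 'language' key ('en' or 'fr')
        records = [r for r in records if r.get('language') == language]
    return records, skipped
```

For example, `records, skipped = load_decisions(2020, 2022, language='en')` would return the English-language records for those three years, along with any yearly files that had to be skipped. The `fetch` parameter exists so that the download step can be swapped out, for instance to read from a local copy of the yearly files.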
NOTES:
(1) Data Source: Supreme Court of Canada.
(2) Unofficial Data: The data are unofficial reproductions of materials on the Supreme Court of Canada website. Links to official versions are included in the dataset.
(3) Non-Affiliation / Endorsement: The data have been collected and reproduced without any affiliation with or endorsement from the Supreme Court of Canada.
(4) Non-Commercial Use: As indicated in the license, the data may be used for non-commercial purposes only (with attribution). For commercial use, see the Supreme Court of Canada website’s Terms of Use.
(5) Accuracy: The data were collected and processed programmatically for the purposes of academic research. While we make best efforts to ensure accuracy, data gathering of this kind inevitably involves errors. As such, the data should be viewed as preliminary information intended to prompt further research and discussion, rather than as definitive information.