Description: This is a bulk open-access dataset, available in JSON, Parquet and Hugging Face dataset formats, containing the full text of Canadian Human Rights Tribunal (CHRT) decisions. The data-processing pipeline and code snippets for loading the data are available in a repository on the Refugee Law Lab GitHub.
Data: https://github.com/Refugee-Law-Lab/chrt_bulk_data/blob/master/DATA/yearly
Code Repository: https://github.com/Refugee-Law-Lab/chrt_bulk_data
Current Coverage: 2003-Present
Number of Decisions: ~1,600
Languages: English & French
Format: JSON, Parquet, Hugging Face Dataset
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Citation: Sean Rehaag, “CHRT Bulk Decisions Dataset” (2023), online: Refugee Law Laboratory https://refugeelab.ca/bulk-data/chrt
Programmatic Access in Python (JSON via GitHub):
import pandas as pd
import requests

start_year = 2003  # First year of data sought (2003 or later)
end_year = 2023  # Last year of data sought (up to 2023)
base_url = 'https://raw.githubusercontent.com/Refugee-Law-Lab/chrt_bulk_data/master/DATA/YEARLY/'

# Download each yearly JSON file and collect all decisions in one list
results = []
for year in range(start_year, end_year + 1):
    url = base_url + f'{year}.json'
    results.extend(requests.get(url).json())

# Combine the decisions into a single DataFrame
df = pd.DataFrame(results)
df
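
The loop above assumes that every yearly file exists and that every request succeeds. The following is a minimal, more defensive sketch using only the Python standard library. It assumes the same DATA/YEARLY/ URL layout as the snippet above. The helper names `build_year_urls`, `fetch_json` and `load_decisions`, and the 30-second timeout, are illustrative choices and not part of the repository.

```python
import json
import urllib.error
import urllib.request

# Same base URL as in the snippet above (assumed layout: one <year>.json per year)
BASE_URL = 'https://raw.githubusercontent.com/Refugee-Law-Lab/chrt_bulk_data/master/DATA/YEARLY/'


def build_year_urls(start_year, end_year, base_url=BASE_URL):
    """Return a {year: url} mapping for each year in the inclusive range."""
    return {year: f'{base_url}{year}.json' for year in range(start_year, end_year + 1)}


def fetch_json(url, timeout=30):
    """Download one URL and parse its body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def load_decisions(start_year, end_year, fetch=fetch_json):
    """Collect decisions for a range of years.

    Returns (records, missing_years). Years whose file cannot be
    downloaded or parsed are skipped and listed in missing_years.
    `fetch` is injectable so the function can be exercised offline.
    """
    records, missing_years = [], []
    for year, url in build_year_urls(start_year, end_year).items():
        try:
            records.extend(fetch(url))
        except (urllib.error.URLError, ValueError):
            # URLError covers HTTP and network errors; ValueError covers invalid JSON
            missing_years.append(year)
    return records, missing_years
```

Calling `records, missing_years = load_decisions(2003, 2023)` returns the combined list of decisions, which can be passed to `pd.DataFrame(records)` as before, along with any years that were skipped.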