Curating the world’s largest biodiversity dataset for AI