HATHI 1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust

We present a new dataset built on prior work consisting of 1,671,370 randomly sampled pages of English-language prose roughly divided between modes of fictional and non-fictional writing and published between the years 1800 and 2000. In addition to focusing on the “page” as the basic bibliographic...

Bibliographic details
Main authors: Sunyam Bagga, Andrew Piper
Format: Article
Language: English
Published: Ubiquity Press, 2022-03-01
Series: Journal of Open Humanities Data
Online access: https://openhumanitiesdata.metajnl.com/articles/71