HATHI 1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust

We present a new dataset, built on prior work, consisting of 1,671,370 randomly sampled pages of English-language prose, roughly divided between fictional and non-fictional modes of writing and published between the years 1800 and 2000. In addition to focusing on the "page" as the basic bibliographic...

Bibliographic details
Main authors: Sunyam Bagga, Andrew Piper
Format: Article
Language: English
Published: Ubiquity Press 2022-03-01
Collection: Journal of Open Humanities Data
Online access: https://openhumanitiesdata.metajnl.com/articles/71