IA Controlled Digital Lending [ia]
If you are interested in mirroring this dataset for archival or LLM training purposes, please contact us.
Overview from datasets page.
Source | Metadata | Files |
---|---|---|
IA Controlled Digital Lending [ia] | ✅ Some metadata available through Open Library database dumps, but those don’t cover the entire IA collection. ❌ No easily accessible metadata dumps available for their entire collection. 👩💻 Anna’s Archive manages a collection of IA metadata. | ❌ Files only available for borrowing on a limited basis, with various access restrictions. 👩💻 Anna’s Archive manages a collection of IA files. |
This dataset is closely related to the Open Library dataset. It contains a scrape of all metadata and a large portion of files from the IA’s Controlled Digital Lending Library. Updates get released in the Anna’s Archive Containers format.
The Open Library dataset refers directly to these records, but this dataset also contains records that are not in Open Library. We also have a number of data files scraped by community members over the years.
The collection consists of two parts. You need both parts to get all data (except superseded torrents, which are crossed out on the torrents page).
- ia: our first release, before we standardized on the Anna’s Archive Containers (AAC) format. Contains metadata (as JSON and XML), PDFs (from the acsm and lcpdf digital lending systems), and cover thumbnails.
- ia2: incremental new releases, using AAC. Only contains metadata with timestamps after 2023-01-01, since the rest is already covered by “ia”. It also contains all PDF files, this time from the acsm and “bookreader” (IA’s web reader) lending systems. Although the name is not exactly right, we still put bookreader files into the ia2_acsmpdf_files collection, since the two are mutually exclusive.
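The split between the two parts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary-based record layout, and the rule that an “ia2” entry overrides an “ia” entry for the same identifier are all hypothetical and not taken from the actual import scripts. Treating the 2023-01-01 cutoff as midnight UTC is also an assumption.

```python
from datetime import datetime, timezone

# Cutoff from the description above; midnight UTC is an assumption.
IA2_CUTOFF = datetime(2023, 1, 1, tzinfo=timezone.utc)

# Both ia2 lending systems land in the same collection, as described above.
IA2_FILE_COLLECTION = {
    "acsm": "ia2_acsmpdf_files",
    "bookreader": "ia2_acsmpdf_files",
}


def part_for_timestamp(ts: datetime) -> str:
    """Return the part ("ia" or "ia2") expected to hold metadata with this timestamp.

    `ts` must be timezone-aware; comparing a naive datetime raises TypeError.
    """
    return "ia2" if ts > IA2_CUTOFF else "ia"


def merge_metadata(ia_records: dict, ia2_records: dict) -> dict:
    """Combine both parts, keyed by a hypothetical record identifier.

    Assumption: if an identifier appears in both parts, the ia2 entry wins.
    """
    merged = dict(ia_records)
    merged.update(ia2_records)
    return merged
```

For example, `part_for_timestamp(datetime(2022, 6, 1, tzinfo=timezone.utc))` returns `"ia"`, which reflects why both parts are needed to get all data.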
Resources
- Total files: 10,463,656
- Total filesize: 317.6 TB
- Files mirrored by Anna’s Archive: 10,135,270 (96.862%)
- Last updated: 2024-11-05
- Torrents by Anna’s Archive
- Example record on Anna’s Archive
- Main IA Controlled Digital Lending website
- Digital Lending Library
- Metadata documentation (most fields)
- Scripts for importing metadata
- Anna’s Archive Containers format
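As a quick sanity check, the mirrored percentage listed above follows directly from the two file counts. The short Python sketch below recomputes it; the constant and function names are illustrative only.

```python
TOTAL_FILES = 10_463_656     # "Total files" from the list above
MIRRORED_FILES = 10_135_270  # "Files mirrored by Anna's Archive"


def mirrored_percent(mirrored: int, total: int) -> float:
    """Share of files mirrored, as a percentage rounded to three decimals."""
    return round(100 * mirrored / total, 3)


percent = mirrored_percent(MIRRORED_FILES, TOTAL_FILES)
not_mirrored = TOTAL_FILES - MIRRORED_FILES
print(f"{percent}% mirrored, {not_mirrored:,} files not mirrored")
```

This reproduces the 96.862% figure and shows that 328,386 files were not mirrored as of the last update.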