Books: id.loc.gov
  Library of Congress linked data
  Already in relatively usable formats (XML, JSON-LD, etc.); further work depends on the type of analysis

Disease risk: data.gov.ie - large amounts of discharge data in the Health/HSE section
  Risk estimates could be stitched together from these datasets

Universities: https://www.ucas.com/data-and-analysis/undergraduate-statistics-and-reports
  All data is CSV files inside zip archives
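
Since the files come as CSVs inside zips, a minimal sketch of loading them without unpacking to disk might look like the following. The function name read_csvs_from_zip is made up for illustration, and the UTF-8 encoding and header-row layout are assumptions that have not been checked against the actual UCAS files.

```python
import csv
import io
import zipfile


def read_csvs_from_zip(zip_source):
    """Return {member_name: list of row dicts} for every .csv in the archive.

    zip_source may be a file path or a binary file-like object.
    Hypothetical helper; assumes each CSV has a header row.
    """
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Skip anything that is not a CSV (readme files, folders, etc.)
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Assumption: UTF-8 text; utf-8-sig also drops a leading BOM
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables
```

Calling read_csvs_from_zip on a downloaded archive (path of your choosing) would give one list of rows per CSV, keyed by the file name inside the zip; all values come back as strings, so numeric columns would still need converting.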


