DTU skills training on ‘Database Structures’
16-17 November, 2017 (DHH-ST-03)
Prof. Dr. Martin Theobald, Dr. Robert C. Kahlert (KU Leuven)
What are (big) data? What are databases? What are database structures? What can we do with them? This skills training introduces different database systems and applications and shows how to work with them in historical research.
The first day offers an introduction to hand-curated data and the various ways it can be stored: blog entries, text files, presentations, office documents, wikis, note-taking software, spreadsheets, and SQL databases. We will look at what data is, how to gather and encode it, how to link it back to its point of origin, how to normalize it, and what to do if you need more than the software supports.
The second day approaches the topic of database structures from the perspective of big data. It provides an overview of current trends in distributed data management. We will look at how different data formats (including text, XML and JSON) can be handled by open-source libraries and processed directly in a distributed environment using the Apache Spark platform.