The following is an overview of the schedule when DHRI was hosted in New York City in 2018. Timing and order of workshops in DHRI in June 2020 are subject to change.
As part of our welcome process, we’ll provide some of the history of the project, review the schedule, establish the objectives for the Institute, and ask participants to engage in an introduction and icebreaker activity. Everything we do throughout the Institute is collaborative and community-driven. Therefore, we spend ample time at the beginning getting to know one another, our research interests, and the communities we work in and support.
In this session, we explain the Digital Humanities Research Institute approach to learning computational skills as humanists. We explain why it matters to understand the foundational principles of how computers work: what it means to work from the command line, the principles behind good practices for sharing documents, how coding works, what databases are, and how coding can be used to search, sort, count, and cluster in ways that can be helpful for humanistic research. There is a lot that personal computers allow us to take for granted when we do our work; however, knowing fundamentals can help humanities scholars become more confident users and critics of digital technologies. Such knowledge leads not only to becoming a better self-teacher, but also to more reflective and informed technology choices. It allows us to save time in creating projects when we know what a well-formed dataset should or could look like, what the difference is between using proprietary and open source software, and what kind of support might be needed as projects grow. We’ll review the objectives and the schedule for the next 8 days, sort out pedagogical practices, and set our ambitious course for our time together.
The command line is a powerful, text-based way to interact with your computer. You can automate tasks such as creating, copying, and converting files, set up your programming environment, run programs, control other computers remotely, and access programs and utilities that do not have graphical equivalents. In this introduction, we will learn common commands to explore and manipulate a simple data set. By the end of the session, you'll be able to navigate your computer, create and manipulate files, and transform text-based data using only the command line. Stepping away from a point-and-click workflow, we move into an environment where we have more minute control over each task we'd like the computer to perform. In addition to being a useful tool in itself, the command line gives us access to a second set of programs and utilities and is a complement to learning programming.
Git is a tool for managing changes to a set of files. It allows users to recover earlier versions of a project, and collaborate with other contributors. GitHub is a web-based platform that provides access to open source repositories and facilitates collaboration on files, code, or datasets. This session will introduce participants to version control and collaboration using Git and GitHub, and demonstrate their use in digital projects.
Python is a general-purpose programming language that is suitable for a wide variety of core tasks in the digital humanities. Learning Python fundamentals is a gateway to analyzing data, creating visualizations, composing interactive websites, scraping the internet, and engaging in distant reading of texts. This session in Python fundamentals also introduces essential computing concepts such as data types, iteration, input/output, control structures, and importing libraries. This session will serve as a basis for later sessions in databases, natural language processing, and working with APIs.
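As a small illustration of the concepts named above (data types, iteration, control structures, output, and importing a library), here is a minimal sketch. The word list, the variable names, and the function `count_long_words` are hypothetical examples chosen for this overview; they are not taken from the DHRI curriculum.

```python
# Importing a library: Counter comes from Python's standard library.
from collections import Counter

# Data types: a string, an integer, and a list of strings.
title = "Moby Dick"
min_length = 5
words = ["call", "me", "ishmael", "some", "years", "ago", "never", "mind"]


def count_long_words(word_list, threshold):
    """Return a Counter of words with at least `threshold` letters."""
    counts = Counter()
    # Iteration: visit each word in the list in order.
    for word in word_list:
        # Control structure: only keep words that are long enough.
        if len(word) >= threshold:
            counts[word] += 1
    return counts


long_words = count_long_words(words, min_length)

# Output: print a short report to the screen.
print(f"{title}: {len(long_words)} distinct words of {min_length}+ letters")
for word, count in sorted(long_words.items()):
    print(word, count)
```

Changing `min_length` and rerunning the script is a quick way to see how a single value can steer the behavior of a loop and a condition.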
Databases are invaluable tools for organization, and are better than a spreadsheet for working with multiple data sets, asking questions, and adding structure to your data. This workshop will introduce you to the basics of interacting with databases using Python, and will include hands-on practice creating databases and tables, importing data, and querying the database. We will also discuss cleaning data, and what a good data set might look like.
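The steps mentioned above (creating a database and a table, importing data, and querying) might look roughly like the following sketch. It assumes SQLite through Python's built-in `sqlite3` module, which this overview does not specify; the table name `letters`, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained; pass a filename
# such as "letters.db" (hypothetical) to store the data on disk instead.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Create a table with named, typed columns.
cursor.execute(
    "CREATE TABLE letters (id INTEGER PRIMARY KEY, author TEXT, year INTEGER)"
)

# Import data: each tuple becomes one row. The rows are invented sample data.
rows = [
    ("Author A", 1850),
    ("Author B", 1862),
    ("Author A", 1871),
]
cursor.executemany("INSERT INTO letters (author, year) VALUES (?, ?)", rows)
connection.commit()

# Query the database: count letters per author.
cursor.execute(
    "SELECT author, COUNT(*) FROM letters GROUP BY author ORDER BY author"
)
letters_per_author = cursor.fetchall()
print(letters_per_author)

connection.close()
```

The `?` placeholders let the database library insert the values safely, which is generally preferable to building SQL strings by hand.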
Digital technologies have made vast amounts of text available to researchers, and this same technological moment has provided us with the capacity to analyze that text. The first step in that analysis is to transform texts designed for human consumption into a form a computer can analyze as well. Using Python and the Natural Language Toolkit package (commonly called NLTK), this workshop introduces strategies to turn qualitative texts into quantitative objects. Through that process, we will present a variety of strategies for simple analysis of text-based data.
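To give a sense of what turning a text into a quantitative object can mean, the sketch below tokenizes a sentence and counts word frequencies. It deliberately uses only Python's standard library (`re` and `collections.Counter`) as a stand-in so that it runs without any installation; NLTK provides its own tokenizers and frequency tools, which are not shown here. The sample sentence is invented.

```python
import re
from collections import Counter

# A short sample sentence, invented for this illustration.
text = "The whale, the sea, and the ship: the whale wins."

# Tokenize: lowercase the text and pull out runs of letters.
tokens = re.findall(r"[a-z]+", text.lower())

# Count how often each token occurs.
frequencies = Counter(tokens)

# Type-token ratio: distinct words divided by total words.
type_token_ratio = len(frequencies) / len(tokens)

print(frequencies.most_common(2))
print(round(type_token_ratio, 2))
```

Once a text is a list of tokens, measures such as word frequency and type-token ratio become simple arithmetic over that list.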
This session will introduce attendees to two languages: HTML, a markup language, and CSS, a style sheet language. These are two of the most commonly used languages for rendering information on the web today. In addition to learning the basics of each language and the differences between them, attendees will use a simple text editor and their local computer to begin creating a web site. Beyond learning HTML and CSS, users will leave the workshop with a clearer understanding of how the internet works. This workshop is geared towards beginners – no prior experience with either language or website-building is necessary.
This session will introduce participants to the core concepts of supervised and unsupervised machine learning. We will first do a hands-on example of supervised machine learning with a text classification example, to discuss topics such as exploratory data visualization, data preprocessing, feature representation, and training and testing a machine learning algorithm. We will then work through an unsupervised learning task where we will look for groups in our data using topic modeling. For this session, we will be using the Pandas data analysis library, the scikit-learn machine learning library, and the Matplotlib visualization library. This session is aimed towards researchers who want to find patterns in their data or use their data to predict a phenomenon.
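The supervised half of this workflow (feature representation, training, and testing) can be sketched in a few lines. The example below is a minimal multinomial naive Bayes text classifier with add-one smoothing, written in plain Python as a stand-in so that it runs without installing anything; it is not the scikit-learn code used in the session, and it does not cover topic modeling. The tiny corpus, the labels, and the function names are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Tiny labelled corpus, invented for this illustration.
training_data = [
    ("the ship sailed across the sea", "voyage"),
    ("the captain steered the ship", "voyage"),
    ("the garden bloomed in spring", "nature"),
    ("flowers filled the garden", "nature"),
]
test_data = [
    ("the ship left the harbor", "voyage"),
    ("spring flowers in the garden", "nature"),
]


def tokenize(text):
    """Feature representation: split a text into lowercase word tokens."""
    return text.lower().split()


def train(examples):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def predict(text, word_counts, label_counts, vocabulary):
    """Return the label with the highest log-probability for the text."""
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior: how common the label is in training.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Train on one set of texts, then test on texts the model has not seen.
model = train(training_data)
predictions = [predict(text, *model) for text, _ in test_data]
correct = sum(pred == label for pred, (_, label) in zip(predictions, test_data))
accuracy = correct / len(test_data)
print(predictions, accuracy)
```

Keeping the test texts separate from the training texts is the key design choice: accuracy measured on unseen data gives a more honest picture of how the classifier would behave on new material.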
This discussion-based workshop will address an array of ethical questions and concerns for folks doing digital projects or research with an emphasis on consent, personhood, confidentiality, political economy, the politics of knowledge production, and accessibility. In addressing these issues, this workshop will first provide a general overview of ethics for institutional research compliance - including the Belmont Report and Institutional Review Board - and then delve into an array of ethical issues that extend beyond institutional purview.
The approach of this workshop is premised on the understanding that there is no simple roadmap for practicing 'good ethics' and, indeed, what constitutes 'good' or 'ethical' may vary from one individual to the next and is often reflective of a scholar's political commitments and personal background. Nonetheless, this workshop will foreground key ethical questions to ask (and keep asking!) when designing and doing digital projects or digital research, and key concepts to draw upon when thinking through these questions.
Moderator: Kelsey Chatlosh
Panelists: Kelly Baker Josephs, Shana Kimball, Luke Waltzer, and Patrick Sweeney
A discussion of digital ethics with an emphasis on social justice, transparency, and accessibility. Humanists will cover topics such as the human impact of big data analysis, sharing personal information, and developing tools that capture personal data. Panelists will address intersections between recent events, data use, and the role panelists see humanists playing in current debates about data, technology, and research. Each panelist will have 10 minutes of prepared remarks, leaving plenty of time for discussion.
This workshop will offer an approachable introduction to Geographic Information Systems (GIS), a digital tool that allows users to create maps and analyze data in a geospatial context. GIS software can be intimidating, but with this workshop, participants will be able to understand the interface of QGIS, an open-source and versatile GIS solution, and use it for basic yet practical operations. First, we will look at the basic terminology of GIS. Then, we will work through a step-by-step practice scenario that will allow participants to learn many of the tools available in QGIS. By the end of this workshop, participants will be able to use QGIS to: add and create vector and raster layers; view and edit fields and features in the attributes table; perform basic geoprocessing operations on vector layers; and create basic visualizations to facilitate geospatial data analysis.
APIs (Application Programming Interfaces) are a structured way for programs to communicate with other programs. A knowledge of APIs allows your programs to communicate with major services such as The New York Times and Twitter and collect data from organizations such as the Library of Congress. In this session, we'll discuss API fundamentals while using the Twitter API to create a Twitterbot—an automated account. We'll also discuss the ethical use of APIs and how tools such as APIs have shaped the modern internet.
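At its simplest, talking to a web API means building a request URL with parameters and reading back structured data, often JSON. The sketch below shows only that general shape and makes no network call: the endpoint `https://api.example.org/v1/items`, its parameters, and the response body are hypothetical, and it does not represent the Twitter API or any other real service.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters, used only to show the request shape.
base_url = "https://api.example.org/v1/items"
params = {"q": "walt whitman", "format": "json", "page": 1}

# Build the request URL: parameters are encoded into the query string.
request_url = f"{base_url}?{urlencode(params)}"
print(request_url)

# A sample response body, invented here in place of a live network call.
response_body = '{"results": [{"title": "Leaves of Grass", "year": 1855}], "total": 1}'

# Parse the JSON text into Python dictionaries and lists.
data = json.loads(response_body)
titles = [item["title"] for item in data["results"]]
print(titles)
```

Real services typically add authentication and rate limits on top of this pattern, so their documentation and terms of use should be consulted before collecting data.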
Conversation after the weekend break begins with thinking about how the workshops from the previous week might work at each participant’s own institution. How do you lead something like the DHRI at your own organization? We’ll start by reviewing the guiding principles of DHRI and how those principles intersect with the needs of a variety of organizations. We will break DHRI into its component parts: People, Context, Logistics, Curriculum, Outreach, Resources. After working independently and in groups to review the list of resources each participant assembled before coming to the DHRI, participants will have a basic outline for a proposal for their DHRI.
What separates projects that turn into something from those that stall out and go nowhere is the formulation (and constant revision and adaptation) of a reasonable, informed, and purposeful project plan. During this session, participants will be introduced to sound project development and management practices. Starting with an end goal in mind and articulating the needs and opportunities, audience, resources, and work plan, participants will draft a one- to three-page proposal for their DH projects.
Two sessions will provide participants with time to work independently or, if desired, in groups on several possible tasks. Instructors from the previous week will be on hand to answer questions and provide feedback. Possible projects include:
Moderator: Matt Gold
Panelists: Nicky Agate, Jill Cirasella, Patricia Hswe, and Julia Miele Rodas
Invited guest speakers will introduce perspectives on what “open” means as a humanistic ethic within the context of current events. What opportunities and challenges does “open” present to the humanities scholar? What is our responsibility to creating access to knowledge production, and how does that intersect with the needs for privacy? What are the requirements for “open” according to funders, and how does the word “accessibility” take on new valence when considered in the context of ability / disability? What is “universal design”? Following brief introductory remarks, the session will be largely discussion-driven.
During this session, participants will learn how to communicate with the DHRI team throughout the academic year, how to draw on their 20 hours of continued support, how to use the Commons forums to support one another, and how to report back on their progress. Participants will revise their own work plans, integrating deadlines for DHRI-specific requirements (deadlines for paperwork for stipends, travel plans for June 2019). Topics will also include tips for publicity and attribution (logo use, Twitter hashtags, and sharing any articles or news features about their work). The white paper requirements for June 2019 will be introduced, as well as presentation requirements. Discussion will also include the agenda for next June.
In this discussion-based and reflective session, instructors and participants will discuss the pedagogical choices involved in running each of the DHRI sessions. What choices do we make about how to present material? How do these translate to participants’ own communities? What improvements could be made, and how might those improvements be translated back into the design of the workshops? How do we connect DH projects and humanities questions to skills? What might be effective methods for doing the same within your own community? What is an inclusive pedagogy in a skills-based learning setting, and how can those practices be shared with collaborators? What methods of evaluation can be used to help iterate and improve pedagogical practice?
In this session, participants will discuss and practice how to build an open network of community-driven curricula. Participants will learn to fork the DHRI curricula. Participants will learn how to open issues in order to suggest changes or to alert the DHRI team to problems in the curriculum. They will also learn how to initiate pull requests with updates or changes so that they can become participants in the open creation and sustainability of DHRI content.
Each participant will present a three-minute proposal for a DHRI. Presentations will consider: the needs and opportunities of their organization, the likely participants, the format, the challenges, potential partners, the institute’s goals, and the outstanding questions to resolve when they return.