We’re back from another fleeting summer, and it’s time to start work on Seeing Syllabi in earnest. Not that summer was “off”, exactly, or that this is our first pass at the project. In this post, I’ll describe what we’ve done so far and where we’re going with all this.
Pam and I (Pamella Lach, the extraordinarily competent and imaginative Digital Innovation Lab manager) have been meeting since January 2013 to scale the project, generate ideas, work out a budget, and do what we like to call “develop a working relationship” (that is to say: trade stories about our families, unruly dogs, graduate studies, head colds, and strange dreams, and laugh a lot). In the process, we’ve identified a semester-size chunk of the original Seeing Syllabi proposal, then called IVI, that feels both true to the enormous concept that we started with and valuable to a variety of potential users. (See the IVI project page for details on the original plan and how our revised approach has developed.) The rest of the fellowship semester will be devoted to building a prototype of this core function. There are four main things that we’ll need to make this happen. I’ll describe each and say a few things about where each element stands for now.
- A significant corpus of PDF-format syllabi. This is the valuable content at the very center of the project; we’re nowhere without it. We thought we’d have to develop the database ourselves and were bracing for the task of soliciting donations from various institutions, scraping the web (with inspiration from Dan Cohen’s earlier Syllabus Finder project), and begging UNC’s Sakai people for data dumps. Then, while presenting on a Digital Shorts panel at the American Studies Association conference in fall 2012, I encountered Alex Gil and Dennis Tenen of the Open Syllabus Project, who are working on compiling just such a corpus, along with a back-end architecture to make the documents accessible to web apps like ours and other analytical approaches. My friend Steve Brauer of St. John Fisher College–one of the collaborators on the original large-scale IVI proposal–and I set up a video conference with them, and the relationship has developed from there. The OSP shares our commitment to openness, cooperation, inclusiveness, sharing, and collaboration, and we’re coordinating our efforts with theirs as best we can.
- Algorithms to extract metadata from the corpus of syllabi. If you’re familiar with the language of data mining, you know that PDFs and other primarily textual documents are what’s called “unstructured data”–they aren’t arranged into tagged and labelled tables and containers that make it easy for computers to search, organize, and analyze the content. We have to write programs that are able to accept new PDFs and automatically scan them, identify the information we need–author names, text titles, dates, institutional affiliation, discipline, and so on–and tag those documents with that information, the metadata. Only then will we–or anyone–be able to search and visualize patterns within the documents. Since the OSP is focusing on developing the corpus itself and its back-end architecture, we are working on this piece. We’ve hired the brilliant John D. Martin III, a former system administrator and web developer and current PhD student at UNC-Chapel Hill’s School of Information and Library Science, to help us.
- Visualization algorithms. Once the metadata are in place, users will be able to search and run analytics on the dataset by typical syllabus elements like author names, titles, dates, institutions, disciplines, and more. Imagine a phrase net like this, in which each node, or word, in the net represents a syllabus, color-coded by discipline; clicking on a node will link you to that document. Or a network diagram, in which items are clustered based on similar elements between documents. I’m also inspired by projects like the Stanford Dissertation Browser, which lets you sort and explore dissertations by discipline in surprising ways. It’s also very pretty. Highlighting interesting interdisciplinary connections between courses and course materials is one of my pet interests in this project.
- A web-based front end. Quite simply, we need a well-designed web app that will accept user queries, send them to the database, and return search results and/or visualizations.
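To make the metadata-extraction step a little more concrete, here is a minimal sketch in Python. It is not the project's actual code. It assumes the text has already been pulled out of the PDF, and the patterns, the field names, and the `SyllabusMetadata` and `extract_metadata` names are all hypothetical, chosen only for illustration. Real syllabi vary far more than these simple rules allow.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical heuristics for illustration only; they will miss many real syllabi.
TERM_RE = re.compile(r"\b(Spring|Summer|Fall|Winter)\s+((?:19|20)\d{2})\b", re.IGNORECASE)
COURSE_RE = re.compile(r"\b([A-Z]{2,4})\s?(\d{3}[A-Z]?)\b")
INSTRUCTOR_RE = re.compile(
    r"^\s*(?:Instructor|Professor|Prof\.)\s*:?\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


@dataclass
class SyllabusMetadata:
    """The tags we would attach to one syllabus (assumed fields)."""
    course_code: Optional[str] = None
    term: Optional[str] = None
    year: Optional[int] = None
    instructor: Optional[str] = None


def extract_metadata(text: str) -> SyllabusMetadata:
    """Scan plain syllabus text and fill in whatever fields the patterns find."""
    meta = SyllabusMetadata()

    course = COURSE_RE.search(text)
    if course:
        meta.course_code = f"{course.group(1)} {course.group(2)}"

    term = TERM_RE.search(text)
    if term:
        meta.term = term.group(1).capitalize()
        meta.year = int(term.group(2))

    instructor = INSTRUCTOR_RE.search(text)
    if instructor:
        meta.instructor = instructor.group(1)

    return meta


# A made-up syllabus header, used only to exercise the sketch.
sample = (
    "AMST 201: Literary Approaches to American Studies\n"
    "Fall 2013\n"
    "Instructor: Jane Doe\n"
)
sample_meta = extract_metadata(sample)
```

Fields the patterns cannot find are simply left empty, which seems safer than guessing: an untagged document can be flagged for a human to review, while a wrongly tagged one quietly distorts every search and visualization built on top of it.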
So that’s a lot of moving pieces.
I’ve also been doing a lot of thinking about “digital humanities”–what it means, what it is, what relationship it has to the “humanities computing” of yore. Digital humanities is an odd area–not because we haven’t yet managed to define it in a way that feels satisfying to everyone (I appreciate this uncertainty and, frankly, hope it continues to generate questioning and flux), but because at root, we’re dealing with a population of people working with media that they really don’t understand. Advanced graphical user interfaces mean that humanists are able to work in rather sophisticated ways with digital formats, tools, and media without understanding much, or anything, about whatever is going on behind the scenes. Learning to use some of these tools, even through a GUI, can be, as they say, non-trivial. Yes, digital tools and analytics can inspire unexpected and exciting ways of thinking about and interacting with texts and data in anyone, no matter her technical background or savvy. However, I’m interested in exploring the edge between humanists’ (or, really, anyone’s) developing proficiency as users of these tools and many users’ lack of understanding of the nature of the tools themselves or how they operate. What does it mean to work closely with a tool we understand so little about? How does that shape the thinking we do through and around that tool?
To this end, I’m always working on increasing my own technological literacy. This spring and summer I participated in a challenging 12-week web development bootcamp with Bloc; it did not turn me into a professional developer, but I learned a lot. This semester I’m joining a data mining class at SILS with the grad student I mentioned, John. We’ll work together to better understand the problem we’re trying to solve and how best to approach it.
All the time I’ve spent thinking about how to analyze syllabi–what people might want to know about them, what they can tell us about how we teach and learn–has grown into a deep curiosity about the syllabus form itself: its history, what it was/is/should be, what kind of promises it makes, how it makes them, how it’s changing, and more. We may slowly be moving away from the traditional syllabus format in favor of web sites and fillable fields on course management systems, so it’s also possible that the corpus we’re producing will someday constitute an important historical record. As the fellowship semester progresses, I’ll be exploring and working to better understand both this project’s technical languages and its content: the material and conceptual form of the syllabus itself.
Finally, yesterday was the first meeting of the Institute for the Arts and Humanities Faculty Fellows. The group will gather for breakfast, presentations, and conversation each Wednesday morning. The three-hour meeting flew by: I’m honored and a little stunned to be in such accomplished company and am reminded once again of the great good fortune of this fellowship semester. More detail on our sessions together will surely follow, but I wanted to end this inaugural post with a note of thanks to the IAH and the DIL for the gifts of their time and support.