Increasing Access to Knowledge through TEI encoding

September 11, 2020

Heewon Park

This is the seventh in a series of blog posts by participants in the 2019 ACLS Digital Extension Grant project “Ticha: advancing community-engaged digital scholarship” (PI Lillehaugen) published on GlobalSL / Community-based Global Learning Collaborative and Ticha. Previous blog posts are available here: Lillehaugen/January 2020; Flores-Marical/February 2020; Kawan-Hemler/March 2020; Lopez/July 2020; Kadlecek/1 August 2020; García Guzmán/15 August 2020.

The past three months have been novel and unexpected, to say the least. In the earlier months of 2020, I was still attending in-person classes at Swarthmore College, finishing up my junior year as a political science major. I remember the morning that Brook (a Ticha team member and my internship supervisor) called me to let me know that I was accepted as a summer intern. I was sitting in the dining hall with my friends, raving with excitement for how incredible this opportunity would be — traveling to Oaxaca, working with local communities, being able to practice my newly learned Spanish. As a low-income student and immigrant, I did not have the chance to travel for most of my childhood, due to either visa issues or financial burdens, and I was immensely excited by the chance to work with Zapotec communities in both Los Angeles and Oaxaca.

Of course, with the onset of coronavirus and the intensification of the pandemic, my initial hopes for the summer never became reality; I would be lying if I said that I wasn’t disappointed. Nonetheless, my work and my team have allowed me to find meaning and enrichment in the past few months, despite the remote landscape of it all. My specific job was to increase the accessibility of our team’s content, which translated into tangible tasks like improving the website’s graphic design and normalizing Early Modern Spanish texts into a more modern, accessible version of the language[1]. In order to create this aspect of the digital editions, several other team members and I attended a two-part Zoom training on TEI encoding, during which we learned how to normalize texts by replacing older forms of Spanish words with modern equivalents that Spanish readers today can better understand. To outsiders, I realize this might seem like a trivial task, particularly since many older forms of Spanish resemble their modern counterparts in a fairly intuitive manner; for instance, it is not surprising that the older spelling mysterio holds the same meaning as its modern version misterio. However, the same is not always true for words like aſcenſion (ascensión) or ſu (su), and the links between such words may be even less intuitive if the reader is not a native Spanish speaker.

For a helpful comparison, consider Shakespeare: as modern English speakers, we can read his works and gain a functional degree of comprehension of the narrative; however, it is by no means easy, and we can often miss important facets of the plot or message because of linguistic changes we are unfamiliar with, such as syntax that confuses our modern understanding of English and words that mean something different today than they did in the 16th century. This is why so many editions of Shakespeare’s plays are “translated” to some degree: they provide enough language normalization that an average English speaker today can enjoy his works without first having to study Early Modern English as a separate language. Similarly, without the Spanish normalization work, the Zapotec texts and analysis that the Ticha team had access to would have remained restricted to narrow realms of academia and experts in Colonial Valley Zapotec. In fact, the normalization work contributes directly to one of Ticha’s primary goals: making historical and linguistic knowledge more egalitarian through the creation of resources that are accessible to both non-academics and non-specialists.

Fig. 1. An example of the TEI encoding allowing us to change older forms of Spanish words into their modern counterparts.
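
For readers curious about what this kind of encoding can look like in practice, here is a minimal sketch in Python. It assumes the standard TEI pattern of wrapping an original spelling (`<orig>`) and its regularized form (`<reg>`) in a `<choice>` element; the markup actually used in the Ticha files may well differ. The sample line and the `render` function are invented for illustration and are not taken from the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample line (not from the Ticha files). Each <choice> pairs an
# original spelling (<orig>) with a regularized one (<reg>). Real TEI files
# also declare the TEI namespace, which is omitted here for brevity.
SAMPLE = (
    "<p>el <choice><orig>mysterio</orig><reg>misterio</reg></choice> de "
    "<choice><orig>ſu</orig><reg>su</reg></choice> "
    "<choice><orig>aſcenſion</orig><reg>ascensión</reg></choice></p>"
)


def render(xml_text, modern=True):
    """Return the line as plain text, keeping <reg> or <orig> in each <choice>."""
    keep = "reg" if modern else "orig"

    def walk(node):
        if node.tag == "choice":
            # Keep only the requested reading and drop the alternative.
            chosen = node.find(keep)
            text = "".join(chosen.itertext()) if chosen is not None else ""
        else:
            # Ordinary element: its own text followed by its children in order.
            text = (node.text or "") + "".join(walk(child) for child in node)
        # The tail is the text that follows this element inside its parent.
        return text + (node.tail or "")

    return walk(ET.fromstring(xml_text))


print(render(SAMPLE, modern=False))  # original spellings
print(render(SAMPLE, modern=True))   # modern spellings
```

Because both spellings live side by side in one file, a website can offer the original and the modernized reading as two views of the same text. In this sketch, the first call prints "el mysterio de ſu aſcenſion" and the second prints "el misterio de su ascensión".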

There were certainly times when the TEI encoding felt tedious, when working with GitHub was frustrating, and when I worried that the normalizations might not even be noticed. However, the reality is that increasing accessibility is not a simple process, nor is it about attaining recognition or praise; it’s about providing respect, dignity, and acknowledgment to a body of work. Many children, students, and adults who have grown up in Zapotec communities have not had the opportunity to study the corpus of texts written in Zapotec, or at times even to know that such a corpus exists; accessibility is all too frequently absent from academic work on indigenous languages.

Fig. 2. Here you can see an example of the resulting digital edition of one of the texts our team was working on. The Modern Spanish tab on the Ticha website normalizes older spellings like “excelentiſisimo” or “oy” into their modern counterparts of “excelentísimo” and “hoy”.

Fig. 3. Here is another example, where we see Spanish and Zapotec translations alongside one another.

Through the process of collaborating with the other interns and the Zapotec advisory board, I have realized that our normalization work, albeit a small portion of Ticha’s larger pedagogical aims, is a significant part of providing equal access to knowledge and extending the public reach of our team’s work. Throughout the summer, I have often reflected on what it means to use my knowledge in an equitable and meaningful way. As the recipient of an elite education like that of Swarthmore College, I have had access to incredible tools and learning resources that few are fortunate enough to receive; these are the catalysts that have enabled me to learn Spanish in an academic context and to develop the capacity to understand Spanish texts dating back 500 years. But of course it is important to remember that something that has become easy for me, like reading Early Modern Spanish texts, is not necessarily so for those who have had different educational contexts. This should be reason enough for all scholars to prioritize not only exploration of knowledge, but also accessibility of scholarship; however, too often in academia, we misuse our privileged positions to obscure acquired knowledge, using our status to produce pedantic work that is intelligible only to insular circles of elite scholars.

I admire the Ticha team for many reasons, but particularly for centering accessibility as a primary goal deserving of time and attention. I am proud to have been able to apply my education here to break down at least some barriers to knowledge access, and I am grateful for having had the opportunity to do so with a team so dedicated to dialogue and community collaboration. I hope that even after graduating and entering the professional realm, I will have experiences, whether related to linguistics or not, that are as community-oriented and engaged as the work of the Ticha project.

[1] Lillehaugen, Brook Danielle, Claire Benham, Janet Chavéz Santiago, Emily Drummond, James Arthur Faville, Eloise Kadlecek, Collin Kawan-Hemler, Avery A. King, Bridget Murray, Tomas Paris, Heewon Park, Tristan Jacobo Pepin, May Helena Plumb, Mindy Renee Reutter, James E. Truitt, Christina Nicole Ulowetz, Mike Zarafonetis & Ian Fisher. 2020. Digital edition of Fray Leonardo Levanto’s 1766 Cathecismo de la lengua Zaapoteca, first edition. Online:

