Font Selection and Linguistic Display Improvements in eHRAF

While English makes limited use of diacritical marks, a number of the world’s languages employ the Latin alphabet augmented with such marks to represent specific phonetic distinctions. Familiar examples include the tilde in ñ, accent marks on vowels such as á, and the diaeresis or umlaut in ö; however, the full range of possible diacritic combinations and letters is considerably broader. Linguists, in particular, often combine the International Phonetic Alphabet (IPA) with other diacritic-based transcription schemes to document dialectal variation, transcribe languages without indigenous writing systems, and transliterate text from non-Latin scripts. The last practice was common because earlier publications faced practical limitations: printing presses usually did not allow for the simultaneous use of multiple alphabets. Today, typographic capabilities are far broader, with modern font families offering varying degrees of support for different language families. Some fonts are better suited to African languages, such as Pangea Afrikan, while others are better suited to North American languages, such as Skeena Indigenous.

At HRAF, we have documents that span the globe and encompass a wide breadth of orthographic possibilities. Choosing a font is thus akin to the tale of Goldilocks. Following careful evaluation, Noto Sans has been adopted as the new default typeface for eHRAF. Designed to offer the broadest Unicode support, Noto provides consistent spacing and reliable rendering for many of the orthographies encountered across the database. It is also well suited to accessibility requirements, offering compatibility with screen readers and improved readability for users with dyslexia.

To address cases where Noto Sans may not render certain diacritic combinations optimally, Gentium Plus has been introduced as an alternative option. Gentium is particularly recommended for linguistic texts and for materials documenting languages that lack an indigenous written tradition. Developed by and for linguists, Gentium excels in the precise display of unusual and complex diacritic combinations, ensuring accurate representation of forms, for example:

From SH04-Ona, a culture with no indigenous writing system
From ES10-Scots, precisely conveying pronunciation

 

Here is a side-by-side comparison of the two new fonts:

Noto Sans (NT26-Jicarilla Apache)

 

Gentium (NT26-Jicarilla Apache)

 

To change the font in eHRAF, use the gear icon in the top banner of the website (between your login and the search magnifying glass). This settings panel contains several ways to personalize your eHRAF experience, including the font used for display. Check them out and see which you prefer!

Access eHRAF settings at the top right of the banner.

 

Choose your font.

 

These font updates are part of a broader initiative to improve data quality and expand character display support across both new and existing eHRAF documents. We are continually reviewing older, already published documents to correct previously unrendered text, and any new additions to the database will now have complete Unicode support for the IPA, UPA, and APA phonetic systems. In addition to phonetic transcription systems, HRAF now endeavors to render indigenous writing systems in newly published documents where feasible.

Is there a phonetic transcription system or indigenous writing system you would like to see supported in eHRAF? Please let us know at hraf.support@yale.edu.

Ben Kluga & Teresa Silva