Automatic hyphenation depends on the defined document language
The lang HTML attribute (e.g. lang="en") is essential for creating accessible websites. Without it, browsers and assistive technology can only guess the website's language, leading to a poor user experience.
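As a minimal sketch, the attribute is usually set once on the root element and can be overridden on nested elements whose content is in another language. The page title and the German paragraph are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Language example</title>
  </head>
  <body>
    <!-- Inherits the document language (English) from <html>. -->
    <p>This paragraph is written in English.</p>
    <!-- A nested element can declare its own language. -->
    <p lang="de">Dieser Absatz ist auf Deutsch.</p>
  </body>
</html>
```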
And while I was aware that the attribute is important, I didn't know that it also affects how browsers deal with long words and hyphenation.
If you're displaying text with long words in narrow containers, you might run into overflow situations. Long words will break out of their containers.
If you look at the overflowing words and the container width, the only solution is to break the words into pieces and add hyphens.
The hyphens CSS property can help here.
The default value of the hyphens property is manual. With manual hyphenation, you're in charge of defining when and how a word should be divided. Use a visible hyphen (U+2010 / &hyphen; (HYPHEN)) or an invisible "soft" hyphen (U+00AD / &shy; (SHY)) character to define the hyphenation breakpoints.
Both characters mark the places where a word may be broken apart. Unfortunately, the manual approach doesn't scale.
Think of a site that includes hundreds of pages maintained by various people. Hyphens then need to be rendered in different word locations depending on the responsive layout. A long word on a mobile device might need hyphenation, whereas the same word might be acceptable in a large-screen layout. An all-time-visible hyphen won't do it!
And adding an invisible character to break long words properly... well... you can't expect writers and editors to fiddle around with invisible HTML-encoded characters. That's not going to work either.
<!-- That's too complicated 👇 -->
<div>un&shy;imaginative&shy;ly</div>
Another approach is to use hyphens: auto. With this CSS declaration, you're handing the burden of hyphenating words over to the browser. MDN documents the auto value as follows:
The browser is free to automatically break words at appropriate hyphenation points, following whatever rules it chooses.
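A minimal sketch of what that could look like. The .card class name, the 8em width, and the sample text are assumptions for illustration. The element also carries a lang attribute, because browsers generally need to know the content language before they hyphenate automatically.

```html
<style>
  .card {
    width: 8em;            /* narrow container to provoke line breaks */
    border: 1px solid;
    -webkit-hyphens: auto; /* prefixed form, needed by older Safari versions */
    hyphens: auto;         /* let the browser pick the hyphenation points */
  }
</style>

<div class="card" lang="en">
  An unimaginatively long and incomprehensible sentence.
</div>
```

Where exactly the words break is up to the browser, so results may differ between engines.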
After playing around with text containing long words, I learned that the document language plays a role in how browsers hyphenate words.
Have a look at the demo below to see how the lang attribute affects automatic hyphenation.
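A rough static version of such a comparison could look like the sketch below. The .narrow class name, the width, and the sample word are assumptions. The actual break points depend on the browser and the hyphenation dictionaries it ships for each language.

```html
<style>
  .narrow {
    width: 6em;            /* too narrow for the word to fit on one line */
    border: 1px solid;
    margin-block: 1em;
    -webkit-hyphens: auto; /* prefixed form for older Safari versions */
    hyphens: auto;
  }
</style>

<!-- Same English word, different declared languages. -->
<div class="narrow" lang="en">unimaginatively</div>
<div class="narrow" lang="de">unimaginatively</div>
```

If the browser supports hyphenation for both languages, the two boxes may break the word at different positions because each one applies the rules of its declared language.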
I'm not 100% sure how to correctly hyphenate English words, but I guess that the hyphenation is better with the correct language.
And there you have it: the lang attribute not only makes sites more accessible but also affects how long words are divided. We'd better make sure it's defined. 🙈
If you want to read more about hyphenation and the lang attribute, have a look at these two excellent articles: