Jekyll - How to show “This article can be read in *** minutes” on CJK articles


Excerpt of Liquid tags from my current HTML file

First, you have to define characters_per_minute and site.locale in an HTML file or _config.yml, like this:

characters_per_minute: 500

or

 <html lang="{{ site.locale | slice: 0,2 | default: "ja" }}">


I also use ui-text.yml so that the text around each article is shown in the site’s configured language. This setting isn’t strictly necessary if you don’t use Minimal-Mistakes as your Jekyll theme.

Just in case, here is the directory where I put ui-text.yml.

  • “project_name”/_data/ui-text.yml

Okay then, the Liquid tags below are one solution to the problem.

{% assign document = post | default: page %}
{% comment %} The value of characters_per_minute depends on your language, e.g. Japanese readers average about 500 characters per minute. {% endcomment %}
{% assign characters_per_minute = document.characters_per_minute | default: site.characters_per_minute | default: 500 %}
{% comment %} Despite its name, "words" holds the number of characters. {% endcomment %}
{% assign words = document.content | strip_html | strip_newlines | size %}

{% if words < characters_per_minute %}
  {% comment %} If you don't use ui-text.yml, you can just write: less than 1 minute read {% endcomment %}
  {{ site.data.ui-text[site.locale].less_than | default: "less than" }} 1 {{ site.data.ui-text[site.locale].minute_read | default: "minute read" }}
{% elsif words == characters_per_minute %}
  1 {{ site.data.ui-text[site.locale].minute_read | default: "minute read" }}
{% else %}
  {{ words | divided_by: characters_per_minute }} {{ site.data.ui-text[site.locale].minute_read | default: "minute read" }}
{% endif %}
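
One thing to keep in mind about the last branch: when both values are integers, divided_by does integer division and rounds down, so a 1,200-character article at 500 characters per minute shows as 2 minutes. If you would rather round up, something like the sketch below might work. The sample values are made up for illustration; in the real template, words and characters_per_minute come from the assigns above.

```liquid
{% comment %} Sample values for illustration only. {% endcomment %}
{% assign words = 1200 %}
{% assign characters_per_minute = 500 %}

{% comment %} Integer division rounds down: renders 2 {% endcomment %}
{{ words | divided_by: characters_per_minute }}

{% comment %} Adding (divisor - 1) before dividing rounds up: renders 3 {% endcomment %}
{{ words | plus: characters_per_minute | minus: 1 | divided_by: characters_per_minute }}
```

Whether to round down or up is a matter of taste. Rounding down matches the tags above, while rounding up avoids under-reporting articles that are slightly over a whole minute.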

If you’re new to Jekyll’s Liquid tags, I recommend checking the site below.

If you set en as your site’s default language and set up ui-text.yml correctly, you’ll see “** minute read” on your page. In addition, if you want to show a clock icon before the text, you’ll probably like Font Awesome.

If you are familiar with Jekyll and some front-end tech, I recommend reading the example below. It may help your understanding.

Looooong background of this article

Since Jekyll version 4.1.0, released on 27 May 2020, new arguments (cjk and auto) for number_of_words have been available. Congrats! You can read the description of the options↓.

The difference between cjk and auto is:

Passing ‘auto’ (auto-detect) works similar to ‘cjk’ but is more performant if the filter is used on a variable string that may or may not contain CJK chars.

So, with the auto option, Jekyll can return more accurate word counts for articles that mix several kinds of text, e.g. prose and program code. Looking closely at the examples on the page I just introduced:

{{ "hello world" | number_of_words }}

The word count is 2: “hello” and “world”.

{{ "你好hello世界world" | number_of_words }}

The word count is 1 because there are no spaces between the words.

{{ "你好hello世界world" | number_of_words: "cjk" }}

The count is 6.

{{ "你好hello世界world" | number_of_words: "auto" }}

The count is 6. Awesome…

Problem I faced

In Minimal-Mistakes, the theme I am using now, estimated reading time (ERT) was calculated like this:

The number of words in an article / Words-per-minute = ERT

Here, words per minute means the number of words a person can read in one minute. Native Japanese speakers are said to read about 500 characters per minute. On the other hand, normal (untrained) native English speakers are said to read about 200 words per minute.

So for CJK readers, words per minute is basically meaningless; what you need is characters per minute.

So, as in the Liquid tags above, I changed the equation to:

The number of characters in an article / Characters-per-minute = ERT

Remember the “cjk” and “auto” arguments I mentioned earlier:

{{ "你好hello世界world" | number_of_words: "auto" }}
// counts: 6

With "auto", English is counted by words while CJK text is counted by characters!! So for an article that mixes two or more languages, like English and Japanese, the count is a blend of two different units, and you can easily imagine that its ERT can be extremely wrong, right?
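
To make the difference concrete, here is a small sketch that runs the same mixed string through number_of_words and through size. It assumes Jekyll 4.1.0 or later, since the "auto" argument does not exist before that.

```liquid
{% comment %} The same mixed Chinese/English string used in the examples above. {% endcomment %}
{% assign sample = "你好hello世界world" %}

{% comment %} 4 CJK characters + 2 English words: renders 6 {% endcomment %}
{{ sample | number_of_words: "auto" }}

{% comment %} size counts every character, CJK or not: renders 14 {% endcomment %}
{{ sample | size }}
```

Dividing 6 by a characters-per-minute value would under-count the English part, while 14 is at least in a single unit (characters) throughout.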

Then,

{% assign characters_per_minute = document.characters_per_minute | default: site.characters_per_minute | default: 500 %}
{% assign words = document.content | strip_html | strip_newlines | size %}
{% comment %} Despite its name, "words" now holds the number of characters. {% endcomment %}

written this way, we can calculate the ERT using characters_per_minute. All that remains is to put these tags into your own HTML.

Summary

Someday, perhaps Jekyll will be able to show ERT for CJK text out of the box, since its features are gradually being improved. Also, as far as I know, no other article describes how to show ERT for CJK text, so be sure to give it a try!!

If you run into errors or know any other methods, let me know. I can receive messages via Twitter DM.
