On reading Kanji characters

"The way a number is read depends on context and might introduce confusion in the dataset."

I've always wondered about this sentence from How to, too.
The way a kanji is read depends on context as well! And most kanji have two or more readings of their own.

I'll list as many as I can think of.

A. same meaning / same character / different reading

I think this is what @Adrijaned is concerned about:
"as long as there is not multiple ways to pronounce what you have written in the context it is in, it should be fine."

Certainly, the context can narrow down the reading to some extent. But that is a tendency, not an absolute rule. How a speaker reads a word depends on their knowledge and lifestyle (occupation, how much they read, and so on). Or, more to the point, it can simply be a matter of preference. So when we are asked to read something "correctly", we are perplexed: "They are all correct, aren't they?"

The speech algorithm needs to know how to read everything.

B. same meaning / different character / same reading

Which character is used depends on the nuance each character carries, or simply on the writer's preference.

C. different meaning / same character / different reading

The reading depends on the context and the word.


Yes, it's impossible to determine the reading from such a short context.
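As a rough sketch of why a short context is not enough, the lookup below maps one written form to several readings with different meanings. The two entries are illustrative assumptions (they are not the examples from this post), and `READINGS`, `possible_readings`, and `is_ambiguous` are hypothetical names.

```python
# Hypothetical lookup table: one written form, several readings with different meanings.
# These entries are illustrative assumptions, not the examples from the original post.
READINGS = {
    "今日": [("きょう", "today"), ("こんにち", "these days")],
    "辛い": [("からい", "spicy"), ("つらい", "painful")],
}


def possible_readings(written):
    """Return every known reading of a written form (empty list if unknown)."""
    return [reading for reading, _meaning in READINGS.get(written, [])]


def is_ambiguous(written):
    """True when the written form alone does not pin down a single reading."""
    return len(possible_readings(written)) > 1


for word in READINGS:
    print(word, possible_readings(word), is_ambiguous(word))
```

A sentence containing only such a word and little else gives a reader (or a speech pipeline) nothing to choose between the candidates.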

D. different meaning / different character / same reading

So-called 同音異義語 (dōon-igigo, meaning "homonyms").

Example 1

Any Japanese pronunciation can be written in hiragana, but here's why it shouldn't be. Of course, there is a difference in pitch accent between 書けて and 欠けて. But 記事 and 生地 sound the same. If we're trying to figure out the meaning from a sentence written only in hiragana, we're going to need more context.
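A minimal sketch of the problem, assuming a hypothetical phrase きじ が かけて built from the words mentioned above: grouping the written forms by their hiragana reading shows how many spellings a single hiragana string can stand for. The names `PAIRS`, `group_by_reading`, and `candidate_spellings` are made up for illustration.

```python
from collections import defaultdict
from itertools import product

# (written form, hiragana reading) pairs taken from the words mentioned above.
PAIRS = [
    ("記事", "きじ"),
    ("生地", "きじ"),
    ("書けて", "かけて"),
    ("欠けて", "かけて"),
]


def group_by_reading(pairs):
    """Map each hiragana reading to all written forms that share it."""
    groups = defaultdict(list)
    for written, reading in pairs:
        groups[reading].append(written)
    return dict(groups)


def candidate_spellings(tokens, groups):
    """List every kanji spelling a sequence of hiragana tokens could stand for."""
    # Tokens without a known homophone group (e.g. particles) are kept as-is.
    options = [groups.get(token, [token]) for token in tokens]
    return ["".join(combo) for combo in product(*options)]


groups = group_by_reading(PAIRS)
# Hypothetical phrase, pre-split into tokens for simplicity.
print(candidate_spellings(["きじ", "が", "かけて"], groups))
```

Three hiragana tokens already yield four candidate spellings, and nothing in the hiragana itself says which one was meant.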

Example 2

It's a common pun, a bit like "ice cream" and "I scream", though those two are pronounced a little differently.