Morphology
As described in the Introduction, the parts of speech of Japanese are defined by a mapping from the UniDic POS tags of SUWs (short unit words) and LUWs (long unit words), together with their syntactic positions, to UD POS tags. UniDic is used as the basis because its guidelines are fully established and widely used in Japanese NLP. The rules are defined in cabocha2ud/conf/bccwj_pos_suw_rule.yaml.
The following table maps UniDic SUW POS tags to Universal Dependencies POS tags (the table is not yet finalized; suggestions are welcome).
UD POS | UniDic SUW POS |
---|---|
ADJ | 形容詞(adjective), 連体詞(adnominal), 形状詞(adjectival noun), 名詞-普通名詞-形状詞可能, 名詞-普通名詞-サ変形状詞可能 |
ADV | 副詞(adverb), 名詞-普通名詞-副詞可能 |
INTJ | 感動詞(interjection) |
NOUN | 名詞-普通名詞(common noun), 接頭辞(prefix), 接尾辞(suffix), 名詞-普通名詞-サ変形状詞可能, 名詞-普通名詞-副詞可能, 名詞-普通名詞-助数詞可能, 形状詞-助動詞語幹, 記号, 外国語 |
PROPN | 名詞-固有名詞(proper noun), 外国語 |
VERB | 動詞(verb), 名詞-普通名詞-サ変可能, 名詞-普通名詞-サ変形状詞可能 |
ADP | 助詞-格助詞(case particle), 助詞-係助詞(binding particle), 助詞-副助詞 |
AUX | 助動詞(auxiliary verb), 動詞-非自立可能(する, できる, くださる, いただく, いたす, なさる), 形状詞-助動詞語幹(そう, よう), 名詞-助動詞語幹 |
CCONJ | 接続詞(conjunction), 助詞-格助詞(case particle) |
DET | 連体詞(adnominal)(こそあど) |
NUM | 名詞-数詞(numeral noun), 名詞-普通名詞-助数詞可能 |
PART | 助詞-副助詞(adverbial particle), 助詞-終助詞(phrase final particle), 助動詞, 接尾辞-形容詞的, 接尾辞-名詞的, 接尾辞-動詞的 |
PRON | 代名詞(pronoun) |
SCONJ | 助詞-接続助詞(conjunctive particle), 助詞-準体助詞(nominal particle) |
PUNCT | 補助記号(supplementary symbol) |
SYM | 記号(symbol), 補助記号(supplementary symbol) |
X | 空白(white space) |
Several UniDic SUW POS tags are mapped to different UD POS tags depending on additional information such as lemmas, LUW POS tags, and/or syntactic context (whether the word is a HEAD or not).
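
The context-dependent part of the mapping can be illustrated with a minimal Python sketch. It is not the actual cabocha2ud implementation and does not reflect the format of the YAML rule file. The function name `suw_to_upos`, the `is_head` flag, the sample set of こそあど words, and the choice to return AUX only for non-head words are all assumptions made for illustration. Only the lemma list for 動詞-非自立可能 and the unambiguous tags are taken from the table above.

```python
from typing import Optional

# Lemmas listed for 動詞-非自立可能 in the AUX row of the table.
AUX_VERB_LEMMAS = {"する", "できる", "くださる", "いただく", "いたす", "なさる"}

# Hypothetical sample of こそあど adnominals; not an exhaustive list.
KOSOADO_ADNOMINALS = {"この", "その", "あの", "どの"}

# A few tags that appear in exactly one row of the table.
UNAMBIGUOUS = {
    "感動詞": "INTJ",
    "代名詞": "PRON",
    "名詞-数詞": "NUM",
    "名詞-固有名詞": "PROPN",
    "助詞-接続助詞": "SCONJ",
    "空白": "X",
}


def suw_to_upos(suw_pos: str, lemma: str, is_head: bool) -> Optional[str]:
    """Return a UD POS tag for a SUW, or None if this sketch has no rule."""
    if suw_pos == "連体詞":
        # Adnominals are DET when they are こそあど words, otherwise ADJ.
        return "DET" if lemma in KOSOADO_ADNOMINALS else "ADJ"
    if suw_pos == "動詞-非自立可能":
        # Assumption: the listed lemmas become AUX only when not a head.
        if lemma in AUX_VERB_LEMMAS and not is_head:
            return "AUX"
        return "VERB"
    return UNAMBIGUOUS.get(suw_pos)


print(suw_to_upos("連体詞", "この", is_head=False))
print(suw_to_upos("動詞-非自立可能", "する", is_head=True))
```

Tags that this sketch does not cover return `None`; a fuller version would also consult the LUW POS tag, which is omitted here.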