Use "MeCab", the Japanese morphological analyzer, via a Web service!
[SENTENCE] ==> MeCab Web Service ==> [Results of Morphological Analysis in XML]
http://www.tahoo.org/~taku/software/mecapi/mecapi.cgi
| Parameter | Value | Description |
|---|---|---|
| sentence | string (required) | The sentence to be analyzed (Japanese, UTF-8) |
| response | surface, feature, pos, inflection, baseform, pronounciation | Controls the data returned by the operation. surface: surface string of the word. feature: combined information; contains pos, inflection, baseform and pronounciation. pos: part of speech of the word. inflection: type and form of inflection. baseform: base form of the word. pronounciation: pronunciation of the word. Default: "surface,feature" |
| filter | noun, uniq | Filters the words in the MeCab result. noun: drops words whose part of speech is not noun. uniq: removes duplicate words and counts them. Default: "" |
| format | xml, json | Specifies the output format. Default: "xml" |
| callback | string | The name of the callback function to wrap around the JSON data. Ignored unless format=json is requested. |
Sample Request URL:
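A request URL can be assembled from the endpoint and the parameters in the table above. The following is only a minimal sketch in Python: the helper name `build_request_url` and its argument names are hypothetical, and it assumes the service accepts the parameters as an ordinary percent-encoded GET query string, with multiple `response` or `filter` values joined by commas as in the default "surface,feature".

```python
from urllib.parse import urlencode

# Endpoint as listed above.
ENDPOINT = "http://www.tahoo.org/~taku/software/mecapi/mecapi.cgi"


def build_request_url(sentence, response_fields=None, word_filter=None,
                      output_format=None, callback=None):
    """Build a GET URL for the service (hypothetical helper).

    Optional parameters are left out when not given, so the
    service-side defaults from the parameter table apply.
    """
    params = {"sentence": sentence}  # required, sent as UTF-8
    if response_fields:
        # Assumption: several values are comma-separated,
        # mirroring the documented default "surface,feature".
        params["response"] = ",".join(response_fields)
    if word_filter:
        params["filter"] = ",".join(word_filter)
    if output_format:
        params["format"] = output_format
    if callback:
        params["callback"] = callback
    # urlencode percent-encodes non-ASCII text as UTF-8 by default.
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("すもももももももものうち",
                            response_fields=["surface", "pos"],
                            word_filter=["noun"],
                            output_format="json"))
```

Leaving out every optional argument yields a URL with only `sentence`, which should correspond to the documented defaults (XML output with surface and feature).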
| Field | Description |
|---|---|
| MecabResult | Contains all of the results. |
| word | Contains each individual word. |
| surface | Surface string of a word. |
| feature | Contains pos, inflection, baseform and pronounciation. |
| pos | Part-Of-Speech of the word. |
| inflection | Type and form of inflection. |
| baseform | Baseform of the word. |
| pronounciation | Pronunciation of the word. |
| count | The number of times the word occurs in the result (only with filter=uniq). |
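The fields above can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`. Note that the sample document is invented for illustration: it assumes `MecabResult` is the root element, that each `word` is a direct child of it, and that the requested fields (here `surface`, `pos` and `baseform`) appear as child elements of `word`. The real response layout may differ, so check it against an actual reply from the service.

```python
import xml.etree.ElementTree as ET

# Hypothetical response for response=surface,pos,baseform.
# The element nesting is an assumption based on the field table.
SAMPLE_XML = """<MecabResult>
  <word><surface>走っ</surface><pos>動詞</pos><baseform>走る</baseform></word>
  <word><surface>た</surface><pos>助動詞</pos><baseform>た</baseform></word>
</MecabResult>"""


def parse_words(xml_text):
    """Return one dict per <word>, mapping child tag names to their text."""
    root = ET.fromstring(xml_text)
    words = []
    for word in root.findall("word"):
        # Collect whatever fields are present (surface, pos, count, ...).
        words.append({child.tag: child.text for child in word})
    return words


if __name__ == "__main__":
    for w in parse_words(SAMPLE_XML):
        print(w.get("surface"), w.get("pos"), w.get("baseform"))
```

Collecting the child elements into a dict keeps the parser independent of which `response` values were requested; a `count` field, if present with filter=uniq, would be picked up the same way.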
You can get the source code from here.