Page 78 - E-Book Kecerdasan Buatan Dandung PTI 1A

8.2 The Natural Language Processing Workflow

The stages of text processing are shown in Figure 8.1.

























Figure 8.1 Text Processing Workflow
Based on Figure 8.1, the first stage is preprocessing. Preprocessing cleans a text, for example by removing symbols, punctuation, conjunctions, and so on.
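The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the book's own implementation: the function name `clean_text` and the tiny `CONJUNCTIONS` set are assumptions made for the example, and a real system would use a much longer, language-specific word list.

```python
import re

# Hypothetical, deliberately tiny list of conjunctions (an assumption for
# illustration); a real list would be longer and language-specific.
CONJUNCTIONS = {"and", "or", "but"}

def clean_text(text: str) -> str:
    """Remove symbols, punctuation, and the listed conjunctions from a text."""
    # Replace every character that is not a letter, digit, underscore,
    # or whitespace with a space.
    no_symbols = re.sub(r"[^\w\s]", " ", text)
    # Split on whitespace and drop conjunctions (compared case-insensitively).
    kept = [w for w in no_symbols.split() if w.lower() not in CONJUNCTIONS]
    return " ".join(kept)

print(clean_text("Cats & dogs, and birds!"))  # prints: Cats dogs birds
```

Replacing symbols with a space rather than deleting them keeps words that were separated only by punctuation from being glued together.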








Tokenization: the text is split into tokens, such as words.
Lemmatization: each word is reduced to its lemma (its dictionary form).
Morphological analysis: each word is broken down into its root and its affixes.
Stemming: each word is reduced to its stem.
Lowercasing: all words are converted to lowercase.
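To make the listed operations concrete, here is a small sketch using only the Python standard library. Everything in it is an assumption for illustration: the suffix list, the toy lemma dictionary, and the function names are hypothetical, and the naive suffix stripping stands in for a real stemmer or morphological analyzer, which would be considerably more elaborate.

```python
import re

# Hypothetical toy resources, for illustration only.
SUFFIXES = ("ing", "ed", "s")                      # used by the naive stemmer
LEMMAS = {"went": "go", "better": "good", "mice": "mouse"}  # tiny lemma dictionary

def tokenize(text):
    """Split text into word tokens (runs of letters, digits, or underscores)."""
    return re.findall(r"\w+", text)

def lowercase(tokens):
    """Convert every token to lowercase."""
    return [t.lower() for t in tokens]

def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def lemmatize(word):
    """Look the word up in the toy dictionary; fall back to the word itself."""
    return LEMMAS.get(word, word)

def analyze(word):
    """Return (root, suffix): a very rough stand-in for morphological analysis."""
    root = stem(word)
    return root, word[len(root):]

tokens = lowercase(tokenize("The mice walked, jumping over Books"))
print(tokens)
print([stem(t) for t in tokens])
print([lemmatize(t) for t in tokens])
print([analyze(t) for t in tokens])
```

Note the difference the sketch exposes: the stemmer only trims suffixes, so it leaves "mice" unchanged, while the dictionary-based lemmatizer maps it to "mouse".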
