Orange: Import Documents
Onnowpurbo (talk | contribs) (→Contoh) |
Revision as of 08:57, 15 March 2020
Source: https://orange3-text.readthedocs.io/en/latest/widgets/importdocuments.html
Import text documents from a folder.
Input
None
Output
Corpus: A collection of documents from the local machine.
The Import Documents widget reads text files from a folder and builds a corpus. It can read .txt, .docx, .odt, .pdf and .xml files. If the folder contains subfolders, their names can be used as class labels.
- Folder being loaded.
- Load folder from a local machine.
- Reload the data.
- Number of documents retrieved.
If the Import Documents widget cannot read a particular file for any reason, that file is skipped. Files that are read successfully are sent to the output.
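The skip-on-failure behavior can be illustrated outside Orange with a minimal sketch in plain Python. This is not the widget's implementation: the function name `load_text_files` is hypothetical, only .txt files are handled, and UTF-8 encoding is an assumption.

```python
from pathlib import Path


def load_text_files(folder):
    """Read every .txt file under `folder` (hypothetical helper, not Orange API).

    Returns (documents, skipped): documents is a list of (file name, text)
    pairs, skipped is a list of file names that could not be read.
    """
    documents, skipped = [], []
    for path in sorted(Path(folder).rglob("*.txt")):
        try:
            # Assumption: files are UTF-8 encoded plain text.
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            # An unreadable file is skipped instead of aborting the whole load.
            skipped.append(path.name)
            continue
        documents.append((path.name, text))
    return documents, skipped
```

Calling `load_text_files("speeches")` on a folder of plain-text files would return the readable documents together with the names of any files that were skipped, which mirrors the document count the widget reports after loading.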
Example
To load data in the Import Documents widget, click the folder icon on the right side of the widget and select the folder you want to turn into a corpus. Once loading finishes, the widget shows how many documents were retrieved. To inspect the resulting corpus, connect Import Documents to the Corpus Viewer widget. Here we use a set of President Kennedy's speeches in plain text format.
Next, let us try it with subfolders. We have placed Kennedy's speeches in two folders, pre-1962 and post-1962. If we load the parent folder, these two subfolders will be used as class labels. Check the output of the widget in a Data Table.
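The idea of deriving class labels from subfolder names can be sketched in plain Python as follows. The helper `label_by_subfolder` is hypothetical and is not part of Orange. Using the top-level subfolder as the label, and `None` for files placed directly in the parent folder, are assumptions made for this illustration.

```python
from pathlib import Path


def label_by_subfolder(parent):
    """Pair each .txt file under `parent` with a class label (hypothetical helper).

    Assumption: the label is the name of the top-level subfolder that
    contains the file; files directly in `parent` get the label None.
    """
    parent = Path(parent)
    rows = []
    for path in sorted(parent.rglob("*.txt")):
        relative = path.relative_to(parent)
        # relative.parts is e.g. ("pre-1962", "speech.txt") for a nested file.
        label = relative.parts[0] if len(relative.parts) > 1 else None
        rows.append((path.name, label))
    return rows
```

For a parent folder that contains the subfolders pre-1962 and post-1962, each returned row pairs a file name with one of those two folder names, which corresponds to the class column shown in the Data Table.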