Difference between revisions of "Orange: Data Mining for Business and Public Administration"

We’ve been having a blast with recent Orange workshops. While Blaž was getting tanned in India, Anže and I went to the charming Liverpool to hold a session for business school professors on how to teach business with Orange.
  
Obviously, when we say teach business, we mean how to do data mining for business, say predict churn or employee attrition, segment customers, find which items to recommend in an online store and track brand sentiment with text analysis.
 
 
For this purpose, we have made some updates to our Associate add-on and added a new data set to the Data Sets widget, which can be used for customer segmentation and for discovering which item groups are frequently bought together. Like this:
 
  
 
[[File:Screen-Shot-2017-11-17-at-13.06.22.png|center|300px|thumb]]
 
  
We load the Online Retail data set.
  
 
[[File:Screen-Shot-2017-11-17-at-13.07.31.png|center|300px|thumb]]
 
  
  
Since we have transactions in rows and items in columns, we have to transpose the data table in order to compute distances between items, which then become rows. Alternatively, we could simply ask the Distances widget to compute distances between columns instead of rows. We then send the transposed data table to Distances and compute the cosine distance between items (cosine distance only tells us which items are purchased together, disregarding the quantity purchased).
[[File:Screen-Shot-2017-11-17-at-13.10.24.png|center|300px|thumb]]
 
 
 
  
Finally, we observe the discovered clusters in Hierarchical Clustering. Seems like mugs and decorative signs are frequently bought together. Why so? Select the group in Hierarchical Clustering and observe the cluster in a Data Table. Consider this an exercise in data exploration. :)
  
 
[[File:Screen-Shot-2017-11-17-at-13.04.32.png|center|300px|thumb]]
 
 
 
The second workshop was our standard Introduction to Data Mining for the Ministry of Public Affairs.
 
 
This group, similar to the one from India, was a pack of curious individuals who asked many interesting questions and were not shy to challenge us. How does a Tree know which attribute to split by? Is Tree better than Naive Bayes? Or is perhaps Logistic Regression better? How do we know which model works best? And finally, what is the mean of sauerkraut and beans? It has to be jota!
 
 
Workshops are always fun when you have a curious set of individuals who demand answers! :)
 
  
  

Latest revision as of 12:29, 4 February 2020

Source: https://orange.biolab.si/blog/2017/11/17/data-mining-business-public-administration/


When we say teach business, we mean how to do data mining for business: say, predicting churn or employee attrition, segmenting customers, finding which items to recommend in an online store, and tracking brand sentiment with text analysis.

For this purpose, we have made some updates to our Associate add-on and added a new data set to the Data Sets widget, which can be used for customer segmentation and for discovering which item groups are frequently bought together. Like this:

Screen-Shot-2017-11-17-at-13.06.22.png

We load the Online Retail data set.

Screen-Shot-2017-11-17-at-13.07.31.png

Since we have transactions in rows and items in columns, we have to transpose the data table in order to compute distances between items, which then become rows. Alternatively, we can simply ask the Distances widget to compute distances between columns instead of rows. We then send the transposed data table to Distances and compute the cosine distance between items (cosine distance only tells us which items are purchased together, disregarding the quantity purchased).

Screen-Shot-2017-11-17-at-13.10.24.png
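
The transpose-then-cosine step can also be sketched outside the widgets. The following is a minimal illustration in plain Python, not Orange's own implementation. The item names and quantities are made-up toy values rather than the real Online Retail data, and the helper names (`transpose`, `cosine_distance`, `distance_matrix`) are hypothetical.

```python
import math

# Hypothetical toy data: rows are transactions, columns are items,
# values are quantities purchased (not the real Online Retail data).
items = ["mug", "sign", "lamp"]
transactions = [
    [2, 1, 0],
    [4, 2, 0],
    [0, 0, 3],
    [1, 1, 1],
]


def transpose(table):
    """Turn columns into rows so that each row describes one item."""
    return [list(column) for column in zip(*table)]


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # convention assumed here for all-zero vectors
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(rows):
    """Pairwise cosine distances between all rows."""
    return [[cosine_distance(u, v) for v in rows] for u in rows]


item_rows = transpose(transactions)   # one row per item
distances = distance_matrix(item_rows)

for name, row in zip(items, distances):
    print(name, [round(d, 3) for d in row])
```

In this toy table the mug and sign columns point in almost the same direction, so their distance is close to zero, while both are far from the lamp. Doubling every quantity in one item's row leaves its cosine distances unchanged, which is the sense in which the measure disregards how many units were bought.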

Finally, we observe the discovered clusters in Hierarchical Clustering. It seems that mugs and decorative signs are frequently bought together. Why so? Select the group in Hierarchical Clustering and observe the cluster in a Data Table. Consider this an exercise in data exploration. :)

Screen-Shot-2017-11-17-at-13.04.32.png
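
To get a feel for what a hierarchical clustering does with an item distance matrix, here is a small sketch of agglomerative clustering in plain Python. It is only an illustration under stated assumptions: the distance values and item labels are invented, average linkage is chosen for the example (the widget offers a choice of linkage), and the function name `average_linkage` is hypothetical rather than part of Orange.

```python
def average_linkage(distances, labels):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance and returns the merges in order as (sorted labels, distance).
    """
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for idx, a in enumerate(keys):
            for b in keys[idx + 1:]:
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(distances[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        members = clusters[a] + clusters[b]
        merges.append((sorted(labels[i] for i in members), d))
        clusters[a] = members   # cluster a absorbs cluster b
        del clusters[b]
    return merges


# Hypothetical symmetric distance matrix between four items.
labels = ["mug", "sign", "lamp", "candle"]
distances = [
    [0.00, 0.02, 0.93, 0.90],
    [0.02, 0.00, 0.87, 0.85],
    [0.93, 0.87, 0.00, 0.30],
    [0.90, 0.85, 0.30, 0.00],
]

merges = average_linkage(distances, labels)
for members, d in merges:
    print(round(d, 4), members)
```

With these made-up distances the mug and the sign are merged first, then the lamp and the candle, and the two groups are joined last at a much larger distance. The dendrogram in the widget shows the same kind of merge order, with the merge distance on its axis.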



