Orange: Hierarchical Clustering

From OnnoWiki

Source: https://docs.biolab.si//3/visual-programming/widgets/unsupervised/hierarchicalclustering.html


The Hierarchical Clustering widget groups items using a hierarchical clustering algorithm.

Input

Distances: distance matrix

Output

Selected Data: instances selected from the plot
Data: data with an additional column showing whether an instance is selected

The Hierarchical Clustering widget computes a hierarchical clustering of objects of arbitrary type from a distance matrix and displays the corresponding dendrogram.
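Since the widget's only input is a distance matrix, it may help to see what such a matrix looks like. The following is a minimal sketch in plain Python, assuming Euclidean distance between numeric feature vectors; the function name `distance_matrix` and the sample points are hypothetical and are not part of Orange.

```python
import math


def distance_matrix(rows):
    """Return a symmetric matrix of pairwise Euclidean distances."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(rows[i], rows[j])
            # The matrix is symmetric, so fill both halves at once.
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical points chosen for illustration, not taken from a real dataset.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = distance_matrix(points)
for row in matrix:
    print(row)
```

The diagonal is zero because every item has distance zero to itself. Any other dissimilarity measure could be used instead, which is why the clustering step does not need to know what kind of objects are being compared.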

HierarchicalClustering-stamped.png
  • The widget supports four ways of measuring distances between clusters:
    • Single linkage computes the distance between the closest elements of the two clusters
    • Average linkage computes the average distance between elements of the two clusters
    • Weighted linkage uses the WPGMA method
    • Complete linkage computes the distance between the clusters’ most distant elements
  • Labels of nodes in the dendrogram can be chosen in the Annotation box.
  • Huge dendrograms can be pruned in the Pruning box by selecting the maximum depth of the dendrogram. This only affects the display, not the actual clustering.
  • The widget offers three different selection methods:
    • Manual (Clicking inside the dendrogram will select a cluster. Multiple clusters can be selected by holding Ctrl/Cmd. Each selected cluster is shown in a different color and is treated as a separate cluster in the output.)
    • Height ratio (Clicking on the bottom or top ruler of the dendrogram places a cutoff line in the graph. Items to the right of the line are selected.)
    • Top N (Selects the number of top nodes.)
  • Use Zoom and scroll to zoom in or out.
  • If the items being clustered are instances, a cluster index can be appended to them (Append cluster IDs). The ID can appear as an ordinary Attribute, a Class attribute, or a Meta attribute. In the second case, if the data already has a class attribute, the original class is placed among the meta attributes.
  • The data can be automatically output on any change (Auto send is on) or, if the box isn’t ticked, by pushing Send Data.
  • Clicking this button produces an image that can be saved.
  • Produce a report.
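The four linkage options listed above differ only in how the distance from a newly merged cluster to every other cluster is updated. The sketch below illustrates this in plain Python using the standard Lance-Williams style update rules; it is not Orange's implementation, and the names `agglomerate` and `top_n` are hypothetical. The `top_n` helper loosely mirrors the Top N selection method by replaying merges until the requested number of clusters remains.

```python
def agglomerate(dist, linkage="average"):
    """Return merges as (members_a, members_b, height) tuples in merge order."""
    # Active clusters: cluster id -> list of original item indices.
    clusters = {i: [i] for i in range(len(dist))}
    # Current inter-cluster distances, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    merges = []
    next_id = len(dist)
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        na, nb = len(clusters[a]), len(clusters[b])
        merges.append((clusters[a], clusters[b], height))
        for k in clusters:
            if k in (a, b):
                continue
            dak = d.pop((min(a, k), max(a, k)))
            dbk = d.pop((min(b, k), max(b, k)))
            if linkage == "single":      # closest elements
                new = min(dak, dbk)
            elif linkage == "complete":  # most distant elements
                new = max(dak, dbk)
            elif linkage == "average":   # size-weighted mean (UPGMA)
                new = (na * dak + nb * dbk) / (na + nb)
            elif linkage == "weighted":  # plain mean of the two (WPGMA)
                new = (dak + dbk) / 2
            else:
                raise ValueError(f"unknown linkage: {linkage}")
            d[(k, next_id)] = new  # next_id is larger than every existing id
        del d[(a, b)]
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


def top_n(n_items, merges, n):
    """Replay merges until only n clusters remain; return sorted member lists."""
    clusters = [[i] for i in range(n_items)]
    for left, right, _ in merges[: n_items - n]:
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left + right)
    return sorted(sorted(c) for c in clusters)


# Hypothetical one-dimensional items at these positions.
positions = [0.0, 1.0, 3.0, 7.0]
dist = [[abs(p - q) for q in positions] for p in positions]
for method in ("single", "average", "weighted", "complete"):
    heights = [h for _, _, h in agglomerate(dist, method)]
    print(method, heights)
print(top_n(len(dist), agglomerate(dist, "single"), 2))
```

Comparing the printed merge heights shows how the choice of linkage changes the shape of the dendrogram even when the same pairs are merged in the same order.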

Example

The workflow below shows the output of the Hierarchical Clustering widget for the Iris dataset in a Data Table widget. If Append cluster IDs is selected in the Hierarchical Clustering widget, the Data Table widget shows an additional column named Cluster. This is a way to check how the Hierarchical Clustering widget grouped individual instances.

HierarchicalClustering-Example.png
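The effect of Append cluster IDs can be pictured as adding one labelled column to each row. The following is a rough sketch in plain Python, not the Orange API: the function `append_cluster_ids`, the sample rows, and the "C1", "C2" label format are all assumptions made for illustration.

```python
def append_cluster_ids(rows, clusters, column="Cluster"):
    """Return copies of rows with a cluster label added under `column`."""
    label_of = {}
    for number, members in enumerate(clusters, start=1):
        for index in members:
            label_of[index] = f"C{number}"
    # Rows that belong to no selected cluster get None as their label.
    return [{**row, column: label_of.get(i)} for i, row in enumerate(rows)]


# Hypothetical rows and cluster membership, listed by row index.
rows = [
    {"sepal length": 5.1, "petal length": 1.4},
    {"sepal length": 4.9, "petal length": 1.4},
    {"sepal length": 6.3, "petal length": 6.0},
]
clusters = [[0, 1], [2]]
labelled = append_cluster_ids(rows, clusters)
for row in labelled:
    print(row)
```

The original rows are left unchanged, and each copy carries the label of the cluster it was assigned to, which is what makes the grouping inspectable in a table view.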

In the example below, we load the Iris dataset again, this time adding a Scatter Plot widget that shows all instances from the File widget. The Distances widget computes the distances between data instances and passes them to the Hierarchical Clustering widget. The output of the Hierarchical Clustering widget is then fed into the Scatter Plot widget. This lets us observe the position of the selected clusters in the Scatter Plot projection.

HierarchicalClustering-Example2.png
