Orange: Outliers

 

Latest revision as of 14:00, 18 April 2020

Source: https://docs.biolab.si//3/visual-programming/widgets/data/outliers.html

The Outliers widget performs simple outlier detection by comparing distances between instances.

Input

Data: input dataset
Distances: distance matrix

Output

Outliers: instances scored as outliers
Inliers: instances not scored as outliers

The Outliers widget applies one of two methods for outlier detection. Both methods apply classification to the dataset, one with SVM (multiple kernels) and the other with an elliptical envelope. One-class SVM with a non-linear kernel (RBF) performs well with non-Gaussian distributions, while the Covariance estimator works only for data with a Gaussian distribution.

Outliers-stamped.png


  • Information on the input data, and the number of inliers and outliers based on the selected model.
  • Select the outlier detection method:
    • One-class SVM with non-linear kernel (RBF): classifies data as similar to or different from the core class:
      • Nu is an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors
      • Kernel coefficient is the gamma parameter, which specifies how much influence a single data instance has
    • Covariance estimator: fits an ellipse to the central points using the Mahalanobis distance metric
      • Contamination is the proportion of outliers in the dataset
      • Support fraction specifies the proportion of points included in the estimate
  • Produce a report.
  • Click Detect outliers to output the data.
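
The two methods and their parameters can be sketched outside the widget with scikit-learn. This is only an illustration, assuming that `OneClassSVM` and `EllipticEnvelope` are reasonable stand-ins for the widget's two options; the synthetic data and every parameter value below are made up for the example and are not taken from the widget's defaults.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.svm import OneClassSVM

# Hypothetical data: one Gaussian cluster plus ten points placed far away.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 2)),
    rng.uniform(6.0, 8.0, size=(10, 2)),
])

# One-class SVM with RBF kernel.
# nu: upper bound on the fraction of training errors,
#     lower bound on the fraction of support vectors.
# gamma: kernel coefficient (influence of a single instance).
svm = OneClassSVM(kernel="rbf", nu=0.1, gamma=0.1)
svm_labels = svm.fit_predict(X)  # +1 = inlier, -1 = outlier

# Covariance estimator (elliptical envelope, Mahalanobis distance).
# contamination: assumed proportion of outliers in the data.
# support_fraction: proportion of points used for the robust estimate.
env = EllipticEnvelope(contamination=0.1, support_fraction=0.9, random_state=0)
env_labels = env.fit_predict(X)  # +1 = inlier, -1 = outlier

n_svm_out = int((svm_labels == -1).sum())
n_env_out = int((env_labels == -1).sum())
print("One-class SVM outliers:", n_svm_out)
print("Covariance estimator outliers:", n_env_out)
```

Both estimators return +1 for inliers and -1 for outliers, which corresponds to the widget's Inliers and Outliers outputs.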

Example

Below is a simple example of how to use the Outliers widget. We used the Iris dataset to detect outliers. We chose the one-class SVM with non-linear kernel (RBF) method, with Nu set at 20% (fewer training errors, more support vectors). Then we observed the outliers in the Data Table widget, while we sent the inliers to the Scatter Plot widget.

Outliers-Example.png
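
A rough script equivalent of this workflow, assuming scikit-learn's `OneClassSVM` in place of the widget and the copy of Iris bundled with scikit-learn. The source does not state the kernel coefficient used, so `gamma="scale"` here is an assumption, and the counts may differ from what the widget reports.

```python
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Iris: 150 instances, 4 numeric features.
X = load_iris().data

# Nu set at 20%, as in the example; gamma is an assumed value.
model = OneClassSVM(kernel="rbf", nu=0.2, gamma="scale")
labels = model.fit_predict(X)  # +1 = inlier, -1 = outlier

# Split the data the way the widget's two outputs do.
outliers = X[labels == -1]  # would go to Data Table
inliers = X[labels == 1]    # would go to Scatter Plot

print("outliers:", len(outliers), "inliers:", len(inliers))
```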


References

Related Links