Orange: ROC Analysis
Source: https://docs.biolab.si//3/visual-programming/widgets/evaluate/rocanalysis.html
The ROC Analysis widget plots the true positive rate against the false positive rate of a test.
Input
Evaluation Results: results of testing classification algorithms
The ROC Analysis widget shows ROC curves for the tested models and the corresponding convex hull. It serves as a means of comparison between classification models. The curve plots the false positive rate on the x-axis (1 - specificity; the probability that the predicted target = 1 when the true value = 0) against the true positive rate on the y-axis (sensitivity; the probability that the predicted target = 1 when the true value = 1). The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the classifier. Given the costs of false positives and false negatives, the widget can also determine the optimal classifier and threshold.
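The construction of such a curve can be illustrated with a minimal sketch. It assumes binary labels (1 = target class) and one predicted probability per example; the function name `roc_points` and the sample data are hypothetical and are not part of the widget or of Orange's API.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) points, one per distinct score threshold."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sort by decreasing score: lowering the threshold admits examples in this order.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))  # x = FP rate, y = TP rate
    return points


# Made-up labels and probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
```

Each point corresponds to one threshold; the curve always starts at (0, 0) and ends at (1, 1).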
- Choose the desired Target Class. The default class is chosen alphabetically.
- If test results contain more than one classifier, the user can choose which curves she or he wants to see plotted. Click on a classifier to select or deselect it.
- When the data comes from multiple iterations of training and testing, such as k-fold cross validation, the results can be (and usually are) averaged.
- The averaging options are:
  - Merge predictions from folds (top left), which treats all the test data as if they came from a single iteration
  - Mean TP rate (top right), which averages the curves vertically, showing the corresponding confidence intervals
  - Mean TP and FP at threshold (bottom left), which traverses over thresholds, averages the positions of the curves and shows horizontal and vertical confidence intervals
  - Show individual curves (bottom right), which does not average but plots all the curves instead
- Option Show convex ROC curves refers to convex curves over each individual classifier (the thin lines positioned over curves). Show ROC convex hull plots a convex hull combining all classifiers (the gray area below the curves). Plotting both types of convex curves makes sense since selecting a threshold in a concave part of the curve cannot yield optimal results, disregarding the cost matrix. Besides, it is possible to reach any point on the convex curve by combining the classifiers represented by the points on the border of the concave region.
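The convex curve mentioned above is the upper convex hull of a classifier's ROC points. The following is a minimal sketch using a monotone-chain scan; the function names and sample points are hypothetical, and the widget's own implementation may differ.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left (counter-clockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Upper convex hull of ROC points, running from (0, 0) to (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the previous point while it lies on or below the segment
        # joining its predecessor to p (it sits in a concave region).
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Made-up ROC points; (0.5, 0.75) lies in a concave part of the curve.
points = [(0.0, 0.0), (0.25, 0.75), (0.5, 0.75), (0.75, 1.0), (1.0, 1.0)]
hull = roc_convex_hull(points)
```

Passing the concatenated points of several classifiers to `roc_convex_hull` gives a hull over all of them, which is the idea behind the combined convex hull option.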
The dashed diagonal line represents the behavior of a random classifier. The solid diagonal line represents iso-performance. The black "A" symbol at the bottom of the graph proportionally readjusts the graph.
- The final box is dedicated to the analysis of the curve. The user can specify the cost of false positives (FP) and false negatives (FN), and the prior target class probability.
- Default threshold (0.5) point shows the point on the ROC curve achieved by the classifier when it predicts the target class whenever its probability equals or exceeds 0.5.
- Show performance line shows iso-performance in the ROC space so that all the points on the line give the same profit/loss. A line further to the upper left is better than one further down and to the right. The direction of the line depends upon the costs and probabilities. This gives a recipe for finding the optimal threshold for the given costs: it is the point where the tangent with the given inclination touches the curve, and it is marked in the plot. If we push the iso-performance line higher or more to the left, the points on it cannot be reached by the learner. Moving it down or to the right decreases the performance.
- The widget allows setting the costs from 1 to 1000. Neither the units nor the magnitudes matter; what matters is the ratio between the two costs, so setting them to 100 and 200 gives the same result as 400 and 800. Defaults: both costs equal (500), prior target class probability 50% (from the data).
- Example setting: false positive cost 830, false negative cost 650, prior target class probability 73%.
- Press Save Image if you want to save the created image to your computer in a .svg or .png format.
- Produce a report.
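The relation between costs, prior probability and the iso-performance line described above can be sketched as follows. This uses the standard expected-cost formulation from ROC analysis; it is an assumption that the widget computes the optimum in exactly this way, and all function names and sample points are hypothetical.

```python
def iso_performance_slope(fp_cost, fn_cost, p_target):
    """Slope of iso-performance lines in ROC space (standard formulation)."""
    return (fp_cost * (1.0 - p_target)) / (fn_cost * p_target)


def expected_cost(fpr, tpr, fp_cost, fn_cost, p_target):
    """Expected misclassification cost per example at one ROC point."""
    false_negatives = p_target * (1.0 - tpr) * fn_cost
    false_positives = (1.0 - p_target) * fpr * fp_cost
    return false_negatives + false_positives


def optimal_point(points, fp_cost, fn_cost, p_target):
    """ROC point with the lowest expected cost (first one wins on ties)."""
    return min(
        points,
        key=lambda pt: expected_cost(pt[0], pt[1], fp_cost, fn_cost, p_target),
    )


# Made-up ROC points, for illustration only.
points = [(0.0, 0.0), (0.125, 0.75), (0.5, 1.0), (1.0, 1.0)]

best_default = optimal_point(points, 500, 500, 0.5)  # equal costs
best_small = optimal_point(points, 100, 200, 0.5)    # FN twice as costly as FP
best_large = optimal_point(points, 400, 800, 0.5)    # same ratio, larger magnitudes
```

Only the ratio of the two costs enters the slope, which is why the 100/200 and 400/800 settings select the same point. Making false negatives more expensive flattens the line and moves the optimum toward a higher true positive rate.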
Example
At the moment, the only widget that provides the right type of signal needed by ROC Analysis is Test & Score. Below, we compare two classifiers, namely Tree and Naive Bayes, in Test & Score and then compare their performance in ROC Analysis, Lift Curve and Calibration Plot.