Difference between revisions of "Data Science: Membuat Model Machine Learning"

Latest revision as of 14:50, 11 February 2025

Building a Machine Learning Model

The figure provides a visual overview of the machine learning model-building process, illustrating key steps from data preprocessing to model evaluation.

1. Initial Dataset and Preprocessing

The process starts with an initial dataset containing input variables (X) and an output variable (Y).
Exploratory Data Analysis (EDA) is performed to understand the structure of the data, using techniques such as PCA (Principal Component Analysis) and SOM (Self-Organizing Map).
Data cleaning, data curation, and removal of redundant features are carried out to prepare a pre-processed dataset.
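As an illustration of removing redundant features, the sketch below drops any column that is almost perfectly correlated with a column already kept. The helper names (`pearson`, `drop_redundant_features`), the 0.95 threshold, and the sample data are assumptions made for this example; they do not come from the figure.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column has no linear relationship to report
    return cov / math.sqrt(var_a * var_b)

def drop_redundant_features(columns, threshold=0.95):
    """Return the names of columns to keep.

    `columns` maps a feature name to its list of values. A column is kept
    only if its absolute correlation with every previously kept column is
    below `threshold` (an assumed cut-off, to be tuned per dataset).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical data: height_in is height_cm expressed in another unit.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [31.0, 25.0, 47.0, 38.0, 29.0],
}
selected = drop_redundant_features(features)
print(selected)
```

Correlation only detects linear redundancy between pairs of features, so this is one simple option rather than a complete preprocessing step.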

2. Data Splitting

The dataset is split into:
- 80% Training Set (used for training the model).
- 20% Test Set (held out and used only for evaluating the trained model).
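The 80/20 split can be sketched with the standard library alone. The function name `split_train_test`, the fixed seed, and the toy data are assumptions for illustration; libraries such as scikit-learn provide their own splitting utilities.

```python
import random

def split_train_test(X, y, test_fraction=0.2, seed=0):
    """Shuffle the sample indices, then hold out `test_fraction` of them."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Hypothetical dataset with 10 samples and 2 input variables each.
X = [[float(i), float(i % 3)] for i in range(10)]
y = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = split_train_test(X, y)
print(len(X_train), len(X_test))
```

Shuffling before splitting avoids a biased test set when the rows are stored in some meaningful order, for example sorted by class.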

3. Model Training and Optimization

Different learning algorithms such as Support Vector Machine (SVM), Deep Learning (DL), Random Forest (RF), Decision Tree (DT), Gradient Boosting Machine (GBM), and K-Nearest Neighbors (KNN) are applied.
Hyperparameter optimization is conducted to improve model performance.
Feature selection keeps only the most informative input variables.
Cross-validation checks how reliably the model generalizes by repeatedly training and validating it on different partitions of the data.
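To make the interaction between a learning algorithm, hyperparameter optimization, and cross-validation concrete, the sketch below tunes the number of neighbors `k` of a small K-Nearest Neighbors classifier using 5-fold cross-validation. All function names, the candidate values for `k`, and the two-cluster toy data are assumptions for this example, not part of the figure.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(zip(X_train, y_train),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def k_fold_indices(n, n_folds):
    """Yield (train_idx, val_idx); fold f validates on samples with i % n_folds == f."""
    for f in range(n_folds):
        val = [i for i in range(n) if i % n_folds == f]
        train = [i for i in range(n) if i % n_folds != f]
        yield train, val

def cv_accuracy(X, y, k, n_folds=5):
    """Mean validation accuracy of KNN with the given k over all folds."""
    scores = []
    for train, val in k_fold_indices(len(X), n_folds):
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        correct = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in val)
        scores.append(correct / len(val))
    return sum(scores) / len(scores)

def select_k(X, y, candidates=(1, 3, 5)):
    """Grid search: return the candidate k with the best cross-validated accuracy."""
    scores = {k: cv_accuracy(X, y, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Hypothetical training data: two well-separated clusters, one per class.
X = ([[i * 0.1, i * 0.2] for i in range(10)]
     + [[5 + i * 0.1, 5 + i * 0.2] for i in range(10)])
y = [0] * 10 + [1] * 10
best_k, scores = select_k(X, y)
print(best_k, scores)
```

In practice the search would be run on the training set only, so that the test set remains untouched until the final evaluation.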

4. Model Evaluation

The trained model produces predicted Y values and is evaluated based on:
- Classification Metrics: Accuracy, Sensitivity, Specificity, and Matthews Correlation Coefficient (MCC).
- Regression Metrics: Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R² (coefficient of determination).
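The metrics listed above follow directly from their standard definitions. The sketch below computes them for binary labels (1 = positive, 0 = negative) and for numeric predictions; the function names and the sample values are assumptions for illustration.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and MCC for binary 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
        # MCC is set to 0 when a row or column of the confusion matrix is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, and R² (coefficient of determination)."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1 - ss_res / ss_tot if ss_tot else 0.0,
    }

# Hypothetical true and predicted values.
clf = classification_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1])
reg = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 2.5, 3.5])
print(clf)
print(reg)
```

MCC uses all four cells of the confusion matrix, which makes it more informative than accuracy when the classes are imbalanced.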

5. Final Evaluation

The model’s performance is assessed, determining whether it meets the desired accuracy and reliability before deployment.

This visual guide summarizes the machine learning workflow, making it easier to follow the core steps of training and evaluating a model.

Related Links

Data Science

 [[File:Image e4a257c0-1b12-49e8-9af1-07a01b892e5820200115 040958.jpg|center|600px|thumb|Membuat Model Machine Learning]]