PM4Py: bottleneck analysis from CSV

From OnnoWiki
Revision as of 07:17, 29 March 2025 by Onnowpurbo

Below is example Python source code that uses the PM4Py library to analyze bottlenecks in a business process from a CSV file and to visualize those bottlenecks as a graph:

1. Install PM4Py (if not already installed)

pip install pm4py

2. Example CSV format

Make sure your CSV has a format like this:

case_id,activity,timestamp
1,A,2023-01-01 10:00:00
1,B,2023-01-01 11:00:00
1,C,2023-01-01 13:00:00
2,A,2023-01-01 09:00:00
2,B,2023-01-01 09:30:00
2,C,2023-01-01 12:30:00
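
Before handing the file to PM4Py, it can help to confirm that it really has this shape. The following is a minimal sketch using only the Python standard library; the helper name check_event_csv and the timestamp format string are assumptions made for illustration, inferred from the sample rows above.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = ["case_id", "activity", "timestamp"]
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the sample rows


def check_event_csv(text):
    """Return a list of problems found in the CSV text; empty means OK."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    # Data rows are counted from 1; the header is not counted.
    for row_no, row in enumerate(reader, start=1):
        if not row["case_id"] or not row["activity"]:
            problems.append(f"row {row_no}: empty case_id or activity")
        try:
            datetime.strptime(row["timestamp"] or "", TIMESTAMP_FORMAT)
        except ValueError:
            problems.append(f"row {row_no}: bad timestamp {row['timestamp']!r}")
    return problems


SAMPLE = """case_id,activity,timestamp
1,A,2023-01-01 10:00:00
1,B,2023-01-01 11:00:00
1,C,2023-01-01 13:00:00
"""

print(check_event_csv(SAMPLE))  # an empty list means no problems were found
```

To check a real file, pass its contents instead of SAMPLE, for example check_event_csv(open("log.csv").read()).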


3. Source code: bottleneck analysis + visualization

import pandas as pd
import pm4py  # simplified interface of PM4Py 2.x

# Step 1: Load the CSV
df = pd.read_csv("log.csv")

# Step 2: Rename the columns to the names PM4Py expects
# and parse the timestamps
df = df.rename(columns={
    "case_id": "case:concept:name",
    "activity": "concept:name",
    "timestamp": "time:timestamp",
})
df["time:timestamp"] = pd.to_datetime(df["time:timestamp"])

# Step 3: Convert the DataFrame to an event log
log = pm4py.convert_to_event_log(df)

# Step 4: Discover the Directly-Follows Graph (DFG)
dfg, start_activities, end_activities = pm4py.discover_dfg(log)

# Step 5: Visualize the DFG (edges are labelled with frequencies)
# Rendering requires Graphviz to be installed
pm4py.view_dfg(dfg, start_activities, end_activities)

# Steps 6 and 7: Compute and display the performance spectrum
# (time between activities). The activities to compare must be
# listed explicitly; A, B and C are the ones in the sample CSV.
pm4py.view_performance_spectrum(log, ["A", "B", "C"], format="png")

Output:

  • Figure 1: the DFG (Directly-Follows Graph), showing the order in which activities follow each other.
  • Figure 2: the performance spectrum chart, showing the time elapsed between activities. This is where bottlenecks become visible (for example, the time between A ➝ B being much longer than between other activities).
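
To make the idea of "time between activities" concrete, the numbers behind such a chart can be approximated by hand. The following is a minimal sketch using only the Python standard library, not PM4Py itself; the function name mean_transition_hours is an assumption for illustration, and the data is the sample CSV from section 2. It averages the waiting time for every pair of activities that directly follow each other within a case.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

SAMPLE = """case_id,activity,timestamp
1,A,2023-01-01 10:00:00
1,B,2023-01-01 11:00:00
1,C,2023-01-01 13:00:00
2,A,2023-01-01 09:00:00
2,B,2023-01-01 09:30:00
2,C,2023-01-01 12:30:00
"""


def mean_transition_hours(text):
    """Mean waiting time in hours for each directly-follows pair."""
    # Group the events by case.
    cases = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        cases[row["case_id"]].append((ts, row["activity"]))

    durations = defaultdict(list)
    for events in cases.values():
        events.sort()  # order each case by timestamp
        # Pair every event with its successor in the same case.
        for (t1, a1), (t2, a2) in zip(events, events[1:]):
            durations[(a1, a2)].append((t2 - t1).total_seconds() / 3600)

    return {pair: sum(v) / len(v) for pair, v in durations.items()}


means = mean_transition_hours(SAMPLE)
# Slowest transition first: that is the bottleneck candidate.
for pair, hours in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pair[0]} -> {pair[1]}: {hours:.2f} h")
# B -> C: 2.50 h
# A -> B: 0.75 h
```

On the sample rows, B ➝ C averages 2.5 hours against 0.75 hours for A ➝ B, so B ➝ C would be the transition to investigate first.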

Tips

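  • For more detail, list the trace variants (distinct activity sequences) sorted by how often they occur:
# Recent PM4Py 2.x releases keep this module under traces.generic
from pm4py.statistics.traces.generic.log import case_statistics
variants_count = case_statistics.get_variant_statistics(log)
# Most frequent variant first
variants_count = sorted(variants_count, key=lambda x: x['count'], reverse=True)
for variant in variants_count:
    print(variant)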
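
Another way to narrow down a bottleneck is to look at how long each case takes from its first to its last event. The following is a minimal standard-library sketch, not a PM4Py call; the function name case_durations_hours is an assumption for illustration, and the data is again the sample CSV from section 2.

```python
import csv
import io
from datetime import datetime

SAMPLE = """case_id,activity,timestamp
1,A,2023-01-01 10:00:00
1,B,2023-01-01 11:00:00
1,C,2023-01-01 13:00:00
2,A,2023-01-01 09:00:00
2,B,2023-01-01 09:30:00
2,C,2023-01-01 12:30:00
"""


def case_durations_hours(text):
    """Hours between the first and the last event of every case."""
    first, last = {}, {}
    for row in csv.DictReader(io.StringIO(text)):
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        cid = row["case_id"]
        # Track the earliest and latest timestamp seen for this case.
        first[cid] = min(ts, first.get(cid, ts))
        last[cid] = max(ts, last.get(cid, ts))
    return {cid: (last[cid] - first[cid]).total_seconds() / 3600
            for cid in first}


durations = case_durations_hours(SAMPLE)
# Longest-running case first.
for cid, hours in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"case {cid}: {hours:.1f} h")
# case 2: 3.5 h
# case 1: 3.0 h
```

Cases at the top of this list are the ones worth inspecting event by event.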


Interesting Links