<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=LLM%3A_create_model_tanpa_huggingface</id>
	<title>LLM: create model tanpa huggingface - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=LLM%3A_create_model_tanpa_huggingface"/>
	<link rel="alternate" type="text/html" href="https://onnocenter.or.id/wiki/index.php?title=LLM:_create_model_tanpa_huggingface&amp;action=history"/>
	<updated>2026-04-17T17:43:36Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.35.4</generator>
	<entry>
		<id>https://onnocenter.or.id/wiki/index.php?title=LLM:_create_model_tanpa_huggingface&amp;diff=72643&amp;oldid=prev</id>
		<title>Onnowpurbo: /* Additional tips: */</title>
		<link rel="alternate" type="text/html" href="https://onnocenter.or.id/wiki/index.php?title=LLM:_create_model_tanpa_huggingface&amp;diff=72643&amp;oldid=prev"/>
		<updated>2025-05-05T09:18:26Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Tips Tambahan:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 09:18, 5 May 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l108&quot; &gt;Line 108:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 108:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;   &lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;   &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Avoid Hugging Face by '''manually downloading the model weights''' from a GitHub mirror or a LLaMA torrent repo.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Avoid Hugging Face by '''manually downloading the model weights''' from a GitHub mirror or a LLaMA torrent repo.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==Interesting Links==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [[LLM]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
	<entry>
		<id>https://onnocenter.or.id/wiki/index.php?title=LLM:_create_model_tanpa_huggingface&amp;diff=72642&amp;oldid=prev</id>
		<title>Onnowpurbo: Created page with &quot;To create an '''Ollama library or model''' for Indonesian without depending on '''Hugging Face''', you can build or fine-tune an LLM locally...&quot;</title>
		<link rel="alternate" type="text/html" href="https://onnocenter.or.id/wiki/index.php?title=LLM:_create_model_tanpa_huggingface&amp;diff=72642&amp;oldid=prev"/>
		<updated>2025-05-05T09:16:38Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Untuk membuat &amp;#039;&amp;#039;&amp;#039;library atau model Ollama&amp;#039;&amp;#039;&amp;#039; berbahasa Indonesia tanpa bergantung pada &amp;#039;&amp;#039;&amp;#039;Hugging Face&amp;#039;&amp;#039;&amp;#039;, kamu bisa membangun atau melatih (fine-tune) model LLM secara lokal...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Untuk membuat '''library atau model Ollama''' berbahasa Indonesia tanpa bergantung pada '''Hugging Face''', kamu bisa membangun atau melatih (fine-tune) model LLM secara lokal, lalu menjalankannya melalui '''Ollama'''. Berikut adalah '''langkah-langkah ringkas dan teknisnya''':&lt;br /&gt;
&lt;br /&gt;
==Goal:==&lt;br /&gt;
&lt;br /&gt;
Build and run an Indonesian-language LLM locally with Ollama, without uploading anything to Hugging Face.&lt;br /&gt;
&lt;br /&gt;
==PREREQUISITES:==&lt;br /&gt;
&lt;br /&gt;
* '''Ubuntu 24.04''' (or another compatible OS)&lt;br /&gt;
* '''Docker''' (optional, but recommended)&lt;br /&gt;
* '''Python + venv''' (for preprocessing and training)&lt;br /&gt;
* '''Ollama''' already installed&lt;br /&gt;
* '''An Indonesian-language dataset''' (for example IndoSum, a Wikipedia dump, OSCAR ID, etc.)&lt;br /&gt;
&lt;br /&gt;
==GENERAL WORKFLOW:==&lt;br /&gt;
&lt;br /&gt;
* '''Prepare an Indonesian-language dataset'''&lt;br /&gt;
* '''Preprocess the text''' and '''tokenize''' it&lt;br /&gt;
* '''Fine-tune a LLaMA/Mistral model''' locally&lt;br /&gt;
* '''Convert to GGUF format'''&lt;br /&gt;
* '''Write the model file for Ollama (`Modelfile`)'''&lt;br /&gt;
* '''Load it into the local Ollama'''&lt;br /&gt;
&lt;br /&gt;
==Get an Indonesian-Language Dataset==&lt;br /&gt;
&lt;br /&gt;
Examples:&lt;br /&gt;
&lt;br /&gt;
* [https://huggingface.co/datasets/oscar OSCAR Indonesia] (can be downloaded manually from the web or with wget)&lt;br /&gt;
* IndoSum (abstractive summary corpus)&lt;br /&gt;
* Indonesian Wikipedia dump: https://dumps.wikimedia.org/idwiki/latest/&lt;br /&gt;
&lt;br /&gt;
If you would rather not pull from Hugging Face, you can:&lt;br /&gt;
&lt;br /&gt;
 wget https://data.statmt.org/oscar/corpus_id.txt.gz&lt;br /&gt;
 gunzip corpus_id.txt.gz&lt;br /&gt;
&lt;br /&gt;
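The manually downloaded archive can also be inspected without unpacking it first. A minimal sketch, assuming the file is gzip-compressed UTF-8 text; `head_gz` is a hypothetical helper name, not part of any tool mentioned on this page:&lt;br /&gt;
&lt;br /&gt;
```python
import gzip
from itertools import islice


def head_gz(path, n=5):
    """Return the first n lines of a gzip-compressed UTF-8 text file."""
    # "rt" opens the archive in text mode, so lines are decoded on the fly.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]


# Example call (requires the downloaded file to exist):
# print(head_gz("corpus_id.txt.gz"))
```
&lt;br /&gt;
Reading only a few lines this way is a cheap check that the download really contains Indonesian text before any preprocessing is run.&lt;br /&gt;
&lt;br /&gt;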
==Preprocessing the Dataset==&lt;br /&gt;
&lt;br /&gt;
Use Python:&lt;br /&gt;
&lt;br /&gt;
 import re&lt;br /&gt;
 &lt;br /&gt;
 def clean_text(text):&lt;br /&gt;
     # Drop URLs, then collapse runs of whitespace into single spaces.&lt;br /&gt;
     text = re.sub(r&amp;quot;http\S+&amp;quot;, &amp;quot;&amp;quot;, text)&lt;br /&gt;
     text = re.sub(r&amp;quot;\s+&amp;quot;, &amp;quot; &amp;quot;, text)&lt;br /&gt;
     return text.strip()&lt;br /&gt;
 &lt;br /&gt;
 # Read and write as UTF-8 explicitly so the result does not depend on the system locale.&lt;br /&gt;
 with open(&amp;quot;corpus_id.txt&amp;quot;, &amp;quot;r&amp;quot;, encoding=&amp;quot;utf-8&amp;quot;) as infile, open(&amp;quot;cleaned.txt&amp;quot;, &amp;quot;w&amp;quot;, encoding=&amp;quot;utf-8&amp;quot;) as outfile:&lt;br /&gt;
     for line in infile:&lt;br /&gt;
         cleaned = clean_text(line)&lt;br /&gt;
         # Keep only lines with more than five words.&lt;br /&gt;
         if len(cleaned.split()) &amp;gt; 5:&lt;br /&gt;
             outfile.write(cleaned + &amp;quot;\n&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
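Web-crawled corpora often repeat the same line many times, so a deduplication pass over the cleaned text may be worth adding. A minimal sketch of exact-duplicate removal, assuming the cleaned lines fit in one pass and that case and spacing differences should be ignored; `dedupe_lines` is a hypothetical helper, not part of the original script:&lt;br /&gt;
&lt;br /&gt;
```python
import hashlib
from typing import Iterable, Iterator


def dedupe_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, keeping first-seen order."""
    seen = set()
    for line in lines:
        # Normalise case and spacing so trivially different copies collide.
        key = " ".join(line.lower().split())
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        if digest not in seen:
            seen.add(digest)
            yield line


sample = [
    "Jakarta adalah ibu kota Indonesia saat ini.",
    "jakarta adalah  ibu kota indonesia saat ini.",
    "Bandung terletak di provinsi Jawa Barat, Indonesia.",
]
unique = list(dedupe_lines(sample))
print(len(unique))  # prints 2
```
&lt;br /&gt;
Storing fixed-size digests instead of the lines themselves keeps memory use predictable on large files; near-duplicate detection would need a different technique.&lt;br /&gt;
&lt;br /&gt;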
==Fine-tune Model (LLaMA/Mistral)==&lt;br /&gt;
&lt;br /&gt;
Use '''llama.cpp''' or '''llama-factory''':&lt;br /&gt;
&lt;br /&gt;
* https://github.com/ggerganov/llama.cpp&lt;br /&gt;
* https://github.com/huggingface/transformers or a local alternative&lt;br /&gt;
&lt;br /&gt;
Example fine-tune with `llama-factory` (no upload to HF):&lt;br /&gt;
&lt;br /&gt;
 git clone https://github.com/hiyouga/LLaMA-Factory&lt;br /&gt;
 cd LLaMA-Factory&lt;br /&gt;
 pip install -r requirements.txt&lt;br /&gt;
 &lt;br /&gt;
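 # Train the model on a local dataset.&lt;br /&gt;
 # Note: a Hub ID such as the one below is fetched from Hugging Face on first use;&lt;br /&gt;
 # pass a local directory of weights instead to stay fully offline.&lt;br /&gt;
 # Recent LLaMA-Factory releases replace src/train_bash.py with the llamafactory-cli entry point.&lt;br /&gt;
 python src/train_bash.py \&lt;br /&gt;
   --model_name_or_path &amp;quot;TheBloke/Mistral-7B-Instruct-v0.1&amp;quot; \&lt;br /&gt;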
   --dataset_dir ./my_data/ \&lt;br /&gt;
   --template mistral \&lt;br /&gt;
   --finetuning_type lora \&lt;br /&gt;
   --output_dir ./output-id \&lt;br /&gt;
   --cutoff_len 512 \&lt;br /&gt;
   --fp16&lt;br /&gt;
&lt;br /&gt;
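The `--dataset_dir ./my_data/` argument points at a directory whose contents are not shown here. A minimal sketch of preparing it, assuming LLaMA-Factory's documented layout of a JSON data file plus a `dataset_info.json` index whose `columns` entry maps `prompt` to the text field (worth checking against the README of the version you cloned). The function name `build_dataset_dir` and the dataset name `corpus_id` are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
import json
from pathlib import Path


def build_dataset_dir(text_lines, out_dir, name="corpus_id"):
    """Write one {"text": ...} record per non-empty line plus a dataset_info.json entry."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # One record per non-empty line of cleaned text.
    records = [{"text": line.strip()} for line in text_lines if line.strip()]
    data_file = out / f"{name}.json"
    data_file.write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )

    # Index entry telling the trainer which file and which column to read (assumed layout).
    info = {name: {"file_name": data_file.name, "columns": {"prompt": "text"}}}
    (out / "dataset_info.json").write_text(json.dumps(info, indent=2), encoding="utf-8")
    return len(records)


# Example call (assumes cleaned.txt from the preprocessing step exists):
# with open("cleaned.txt", encoding="utf-8") as fh:
#     build_dataset_dir(fh, "./my_data")
```
&lt;br /&gt;
Under these assumptions the dataset would be selected by name, for example with `--dataset corpus_id` next to `--dataset_dir`. A plain-text layout like this fits continued pre-training (`--stage pt`) rather than instruction tuning, which expects prompt and response pairs.&lt;br /&gt;
&lt;br /&gt;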
==Convert ke GGUF==&lt;br /&gt;
&lt;br /&gt;
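Use `llama.cpp` to convert the fine-tuned model:&lt;br /&gt;
&lt;br /&gt;
 git clone https://github.com/ggerganov/llama.cpp&lt;br /&gt;
 cd llama.cpp&lt;br /&gt;
 # A LoRA run leaves only adapter weights in ./output-id; merge them into the base model before converting.&lt;br /&gt;
 # Newer llama.cpp checkouts ship convert_hf_to_gguf.py in place of convert.py.&lt;br /&gt;
 python3 convert.py --outtype f16 --outfile model-id.gguf ./output-id&lt;br /&gt;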
&lt;br /&gt;
==Create a Modelfile for Ollama==&lt;br /&gt;
&lt;br /&gt;
Create a file named `Modelfile`:&lt;br /&gt;
&lt;br /&gt;
 FROM ./model-id.gguf&lt;br /&gt;
 PARAMETER stop &amp;quot;&amp;lt;/s&amp;gt;&amp;quot;&lt;br /&gt;
 TEMPLATE &amp;quot;&amp;quot;&amp;quot;### Pertanyaan:&lt;br /&gt;
 {{ .Prompt }}&lt;br /&gt;
 &lt;br /&gt;
 ### Jawaban:&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Then build:&lt;br /&gt;
&lt;br /&gt;
 ollama create indollama -f Modelfile&lt;br /&gt;
&lt;br /&gt;
==Run the Model in Ollama==&lt;br /&gt;
&lt;br /&gt;
 ollama run indollama&lt;br /&gt;
&lt;br /&gt;
You now have a local Indonesian-language model that runs entirely without Hugging Face.&lt;br /&gt;
&lt;br /&gt;
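Besides the interactive `ollama run` prompt, a running Ollama server also exposes a local HTTP API, which is convenient for scripting. A minimal sketch, assuming the default endpoint `http://localhost:11434/api/generate` and the model name `indollama` used on this page; `build_request` and `ask` are illustrative helper names:&lt;br /&gt;
&lt;br /&gt;
```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_request(prompt, model="indollama"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="indollama"):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call (requires the Ollama server to be running locally):
# print(ask("Apa ibu kota Indonesia?"))
```
&lt;br /&gt;
Setting `stream` to false returns one JSON object instead of a stream of partial chunks, which keeps the parsing to a single `json.loads` call.&lt;br /&gt;
&lt;br /&gt;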
==Additional tips:==&lt;br /&gt;
&lt;br /&gt;
* For the base model, you can use an open model such as:&lt;br /&gt;
&lt;br /&gt;
 mistral-7b&lt;br /&gt;
 llama-2-7b&lt;br /&gt;
 &lt;br /&gt;
* Avoid Hugging Face by '''manually downloading the model weights''' from a GitHub mirror or a LLaMA torrent repo.&lt;/div&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
</feed>