[Streamlit] A Chat App Using the gemma-3-1b-pt Model

Hello, this is JS2IIU.
On March 12, 2025, Google released Gemma 3. It is described as lightweight and fast, so I want to try it out for myself. Thanks for joining me again.

1. Introduction
    1. What is the gemma-3-1b-pt model?
2. Installing the required libraries
3. A chat app using the gemma-3-1b-pt model
4. Code walkthrough
5. Common errors
    1. OSError
    2. ValueError
6. Reference links

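1. Introduction

This article builds a chat app that runs on Streamlit, using gemma-3-1b-pt, a natural language processing model provided by Google.

In the app, the gemma-3-1b-pt model automatically generates a response to whatever the user types.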

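What is the gemma-3-1b-pt model?

gemma-3-1b-pt is a large language model (LLM) developed by Google with roughly 1 billion (1B) parameters, as the "1b" in its name indicates. It has the following characteristics.

Pre-trained: the "pt" suffix marks the pre-trained base checkpoint. It has been trained on a diverse set of internet data and can be applied to a wide range of natural language processing tasks.
Lightweight: at about 1 billion parameters, inference is faster and uses fewer resources than with larger LLMs.
Broad applicability: it can be used for text generation tasks such as conversation, summarization, and question answering.

Official page: google/gemma-3-1b-pt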

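2. Installing the required libraries

Start by installing the required Python libraries. accelerate is included because loading the model with device_map="auto" depends on it.

Bash

pip install streamlit transformers torch accelerate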

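3. A chat app using the gemma-3-1b-pt model

From loading the model through to generating text, it felt fast considering that I was running it on a laptop.

Below is the complete code for the Streamlit chat app that uses the gemma-3-1b-pt model.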

Python

import streamlit as st
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Model name
model_name = "google/gemma-3-1b-pt"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Set up the Streamlit app
st.title("Gemma-3-1b-pt Chat App")
st.write("The Gemma-3-1b-pt model responds to your input.")

# Initialize the session state that stores the chat history
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

# Get the user's input
user_input = st.text_input("You: ", "")

# Run only when the user has submitted some text
if user_input:
    # Prepare the input for the model
    prompt = user_input
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate text with the model
    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            max_new_tokens=50,
            pad_token_id=tokenizer.eos_token_id
        )

    # Decode the model's response
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Append the user's input and the model's response to the chat history
    st.session_state.chat_history.append((user_input, response))

# Display the chat history
for i, (user_msg, bot_msg) in enumerate(st.session_state.chat_history):
    st.write(f"**You:** {user_msg}")
    st.write(f"**Gemma-3-1b-pt:** {bot_msg}")

4. Code walkthrough

4.1 Loading the model and tokenizer

Python

model_name = "google/gemma-3-1b-pt"

# Load the tokenizer and model from Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

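AutoTokenizer: converts input text into tokens the model can process.
AutoModelForCausalLM: loads a causal language model, the type suited to text generation.
torch_dtype=torch.bfloat16: uses the bfloat16 format to reduce memory usage.
device_map="auto": places the model on a GPU automatically when one is available.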

4.2 Setting up the Streamlit UI

Python

st.title("Gemma-3-1b-pt Chat App")
st.write("The Gemma-3-1b-pt model responds to your input.")

st.title(): displays the app title.
st.write(): displays the description text.

4.3 Managing the chat history

Python

if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

st.session_state: Streamlit's mechanism for keeping app state across reruns of the script.
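The app stores past turns in st.session_state.chat_history, but only the latest input is sent to the model, so earlier turns do not influence the reply. One possible way to pass context is to fold the history into a plain-text prompt. The sketch below is an assumption rather than the author's method: build_prompt, the "User:" and "Assistant:" labels, and the max_turns cutoff are all hypothetical choices, and a pre-trained base model may or may not follow this format well.

```python
from typing import List, Tuple


def build_prompt(
    history: List[Tuple[str, str]],
    user_input: str,
    max_turns: int = 3,
) -> str:
    """Fold the most recent turns and the new input into one text prompt.

    The "User:"/"Assistant:" labels are an illustrative convention,
    not a format required by the model.
    """
    lines = []
    # Keep only the last `max_turns` exchanges so the prompt stays short.
    recent = history[-max_turns:] if max_turns > 0 else []
    for user_msg, bot_msg in recent:
        lines.append(f"User: {user_msg}")
        lines.append(f"Assistant: {bot_msg}")
    lines.append(f"User: {user_input}")
    # End with the assistant label so the model continues from there.
    lines.append("Assistant:")
    return "\n".join(lines)
```

For example, build_prompt([("Hello", "Hi there!")], "How are you?") yields four lines ending in "Assistant:". This assumes the stored replies contain only the model's own text, not the echoed prompt.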

4.4 Passing input to the model and generating text

Python

inputs = tokenizer(user_input, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs.input_ids,
        max_new_tokens=50,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)

user_input: the text entered by the user.
model.generate(): generates the model's response, up to 50 new tokens (max_new_tokens=50).
4.5 Displaying the chat history

Python

for i, (user_msg, bot_msg) in enumerate(st.session_state.chat_history):
    st.write(f"**You:** {user_msg}")
    st.write(f"**Gemma-3-1b-pt:** {bot_msg}")

st.write(): displays each entry of the chat history in order.

5. Common errors

OSError

The following error occurred when I ran the app without being logged in to Hugging Face.

Python

OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like google/gemma-3-1b-pt is not the path to a directory containing a file named config.json.

Solution:

Log in to Hugging Face before using the model.

Create a Hugging Face account and generate an access token:
→ Hugging Face Tokens
Log in with the access token:

Bash

huggingface-cli login
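When the app runs somewhere that makes an interactive login awkward, it can help to check for a token before loading the model. The sketch below rests on two conventions of the huggingface_hub library that are worth verifying against its documentation: that a token can be supplied through the HF_TOKEN environment variable, and that huggingface-cli login stores it in ~/.cache/huggingface/token by default. The helper name find_hf_token is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Assumed default location used by `huggingface-cli login`.
DEFAULT_TOKEN_PATH = Path.home() / ".cache" / "huggingface" / "token"


def find_hf_token(
    env: Mapping[str, str] = os.environ,
    token_path: Path = DEFAULT_TOKEN_PATH,
) -> Optional[str]:
    """Return a token from the environment or the token file, or None."""
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    if token_path.is_file():
        token = token_path.read_text(encoding="utf-8").strip()
        if token:
            return token
    return None


if __name__ == "__main__":
    if find_hf_token() is None:
        print("No Hugging Face token found; run `huggingface-cli login` first.")
```

Gemma checkpoints are also gated on Hugging Face, so a valid token alone may not be enough: the account typically has to accept the license on the model page before the download succeeds.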

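ValueError

If you see the following error, update transformers. The commands after the message reinstall it from source, as the message itself suggests.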

Bash

ValueError: The checkpoint you are trying to load has model type gemma3_text but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date. You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git

Bash

pip uninstall transformers
pip install git+https://github.com/huggingface/transformers.git
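Before reinstalling, it can be useful to confirm which transformers version is actually installed. The sketch below uses only the standard library. The minimum version (4, 50, 0) is my understanding of the first release with Gemma 3 support and should be treated as an assumption; the helper names parse_version and is_new_enough are hypothetical.

```python
import re
from importlib import metadata
from typing import Tuple

# Assumed first release with Gemma 3 support; verify against the model card.
MIN_VERSION = (4, 50, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.50.0.dev0' into (4, 50, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '4.50' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_new_enough(version: str, minimum: Tuple[int, ...] = MIN_VERSION) -> bool:
    """Compare a version string against the minimum as integer tuples."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if is_new_enough(installed) else "too old for Gemma 3"
        print(f"transformers {installed}: {status}")
```

Pre-release suffixes are ignored, so a source install reporting something like 4.50.0.dev0 is treated the same as 4.50.0.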

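6. Reference links

Finally, a plug for a book.

It is a good book that also explains how to use the latest OpenAI chat API in detail: "LangChainとLangGraphによるRAG・AIエージェント[実践]入門" by Nishimi, Yoshida, and Oshima, first edition published in November 2024.

Thank you for reading to the end.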
