Description
I have hit this error fairly consistently over the past two months when using your "ChatML + chat templates + Mistral 7b full example" notebook. It has occurred while finetuning a number of different models, from Gemma to Llama-2 and Llama-3; this particular run was with a Mistral model, specifically teknium/OpenHermes-2.5-Mistral-7B.
Everything goes fine except saving the finetuned model as a GGUF file, which always fails with a TypeError about a duplicate sentencepiece file name. I have been, and still am, able to save the finetuned model in its non-GGUF form (I call it its full form) to a Hugging Face repo, usually as 4-bit. I then download it and convert it to a GGUF file myself locally with the llama.cpp Python script, so it has not been a huge deal for me personally. But since I have seen the exact same error for a while (and I finally believe it is not due to something dumb I am doing - I think...), I thought you might want to know about it in case it is a bug, or something you would want to or could address on your end.
The error is detailed in the full traceback attached below. It seems that when you perform your sentencepiece tokenizer fix, you import sentencepiece_model_pb2 and call its ModelProto() method.
Everything looks fine until a Google library pulls in its own dependencies, sentencepiece likely among them; at that point both its copy of sentencepiece_model_pb2 and the one you imported get registered in the global descriptor pool, which may be where the duplicate-file error comes from.
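If that theory is right, the failure mode can be reproduced in miniature with protobuf alone: registering two different file descriptors under the same proto file name in one pool fails. This is only a sketch under that assumption - it uses a fresh pool rather than the process-wide default pool from the traceback, and the exact exception type varies by protobuf backend (TypeError under the C++/upb backend, as seen above):

```python
from google.protobuf import descriptor_pb2, descriptor_pool

def register_twice():
    """Register two *different* descriptors under the same proto file name.

    Mimics, in miniature, transformers' vendored sentencepiece_model_pb2 and
    sentencepiece's own copy both claiming 'sentencepiece_model.proto'.
    """
    first = descriptor_pb2.FileDescriptorProto(
        name="sentencepiece_model.proto", package="sentencepiece")
    second = descriptor_pb2.FileDescriptorProto(
        name="sentencepiece_model.proto", package="sentencepiece_copy")

    pool = descriptor_pool.DescriptorPool()  # fresh pool, not the default one
    pool.Add(first)                          # first registration succeeds
    try:
        pool.Add(second)                     # same file name, different content
        return None
    except Exception as exc:                 # TypeError under the upb backend
        return str(exc)

print(register_twice())
```

The second registration only succeeds when the bytes are identical; two independently generated `*_pb2` modules for the same `.proto` generally are not, hence the crash in the default pool.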
But then again, I have been learning Python and AI for all of three months as of today, so I could be embarrassingly wrong.
I did not really change much from the default notebook; the one difference is that I used my own Jinja2 chat template and applied it to the tokenizer (although it really is just a subset of your main Unsloth template and seems like a pretty standard ChatML template).
It happens on this function call, which attempts to save the model as a Q4_K_M-quantized GGUF file:
model.push_to_hub_gguf("b22000r/xyntrai-mistral-2.5-7b/", tokenizer, quantization_method = "q4_k_m", token = "mah'_hf_token")
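Until the underlying issue is fixed, one way to avoid the crash is to skip the GGUF push whenever the saved checkpoint is SentencePiece-based (such checkpoints ship a `tokenizer.model` file) and fall back to converting locally with llama.cpp instead. This is only a sketch; `has_sentencepiece_model` is a hypothetical helper, not part of Unsloth's API:

```python
import os

def has_sentencepiece_model(model_dir: str) -> bool:
    """Heuristic guard: SentencePiece-based checkpoints ship a
    'tokenizer.model' file alongside the weights."""
    return os.path.isfile(os.path.join(model_dir, "tokenizer.model"))

# Usage sketch (model/tokenizer come from the notebook):
# save_dir = "b22000r/xyntrai-mistral-2.5-7b"
# if not has_sentencepiece_model(save_dir):
#     model.push_to_hub_gguf(save_dir, tokenizer,
#                            quantization_method="q4_k_m", token="...")
# else:
#     print("Skipping GGUF push; converting locally with llama.cpp instead.")
```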
Here is a link to the Colab Notebook:
https://colab.research.google.com/drive/1naZeXgRT7cEiOWR-C2RD2Uc9jSwCak_y?authuser=1#scrollTo=FqfebeAdT073
I will see if I can upload the notebook and attach it to this issue submission as well.
I am using a Windows 11 machine with 32 GB of RAM and no local GPU (only 128 MB onboard graphics), but I use a rented Google T4 in Colab.
This is the Error that is produced on screen in Colab:
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 5.61 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...
100%|██████████| 32/32 [02:54<00:00, 5.44s/it]
Unsloth: Saving tokenizer... Done.
Unsloth: Saving b22000/xyntrai-mistral-2.5-7b/pytorch_model.bin...
Done.
==((====))== Unsloth: Conversion from QLoRA to GGUF information
\\ /| [0] Installing llama.cpp might take 3 minutes.
O^O/ \_/ \ [1] Converting HF to GGUF 16bits might take 3 minutes.
\ / [2] Converting GGUF 16bits to ['q4_k_m'] might take 10 minutes each.
"-____-" In total, you will have to wait at least 16 minutes.
Unsloth: Installing llama.cpp. This might take 3 minutes...
Unsloth: [1] Converting model at b22000/xyntrai-mistral-2.5-7b into f16 GGUF format.
The output location will be /content/b22000/xyntrai-mistral-2.5-7b/unsloth.F16.gguf
This might take 3 minutes...
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipython-input-4103010664.py in <cell line: 0>()
7 is_sentencepiece = isinstance(tokenizer, AutoTokenizer) and tokenizer.vocab_file is not None
8 if True and not is_sentencepiece:
----> 9 model.push_to_hub_gguf("b22000/xyntrai-mistral-2.5-7b", tokenizer, quantization_method = "q4_k_m", token = "hf_REDACTED")
10 elif True and is_sentencepiece:
11 print("Skipping GGUF push due to sentencepiece error.")
/usr/local/lib/python3.12/dist-packages/unsloth/save.py in unsloth_push_to_hub_gguf(self, repo_id, tokenizer, quantization_method, first_conversion, use_temp_dir, commit_message, private, token, max_shard_size, create_pr, safe_serialization, revision, commit_description, tags, temporary_location, maximum_memory_usage)
2075
2076 # Save to GGUF
-> 2077 all_file_locations, want_full_precision = save_to_gguf(
2078 model_type, model_dtype, is_sentencepiece_model,
2079 new_save_directory, quantization_method, first_conversion, makefile,
/usr/local/lib/python3.12/dist-packages/unsloth/save.py in save_to_gguf(model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, _run_installer)
1185 vocab_type = "spm,hfft,bpe"
1186 # Fix Sentencepiece model as well!
-> 1187 fix_sentencepiece_gguf(model_directory)
1188 else:
1189 vocab_type = "bpe"
/usr/local/lib/python3.12/dist-packages/unsloth/tokenizer_utils.py in fix_sentencepiece_gguf(saved_location)
408 """
409 from copy import deepcopy
--> 410 from transformers.utils import sentencepiece_model_pb2
411 import json
412 from enum import IntEnum
/usr/local/lib/python3.12/dist-packages/transformers/utils/sentencepiece_model_pb2.py in <module>
26
27
---> 28 DESCRIPTOR = _descriptor.FileDescriptor(
29 name="sentencepiece_model.proto",
30 package="sentencepiece",
/usr/local/lib/python3.12/dist-packages/google/protobuf/descriptor.py in __new__(cls, name, package, options, serialized_options, serialized_pb, dependencies, public_dependencies, syntax, edition, pool, create_key)
1226 # pylint: disable=g-explicit-bool-comparison
1227 if serialized_pb:
-> 1228 return _message.default_pool.AddSerializedFile(serialized_pb)
1229 else:
1230 return super(FileDescriptor, cls).__new__(cls)
TypeError: Couldn't build proto file into descriptor pool: duplicate file name sentencepiece_model.proto