transformers/trainer.py : 'int' object has no attribute 'mean'

I'm trying to train Orpheus-3b using a modified version of the Unsloth notebook. I've already created the dataset and pushed it to the HuggingFace Hub, converted the dataset to tensors, and tokenized the entries. I added some prints in the code to check everything, but with no luck. I'm not a Python dev, but I know some Python. Any help would be much appreciated.
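
For context, the preprocessing was roughly as follows (a simplified sketch; the dataset repo id and column names here are placeholders, not my real ones):

```
from datasets import load_dataset

# Placeholder repo id -- the real dataset lives on my HF account
ds = load_dataset("my-user/my-orpheus-dataset", split="train")

def tokenize_fn(example):
    # Tokenize to a fixed length so torch.stack works in a collator
    out = tokenizer(example["text"], truncation=True,
                    max_length=2048, padding="max_length")
    out["labels"] = out["input_ids"].copy()
    return out

processed_train_ds = ds.map(tokenize_fn, remove_columns=ds.column_names)
# Return PyTorch tensors instead of Python lists
processed_train_ds.set_format("torch")
```

The blocks that keep failing whatever I do are: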

```
import torch  # needed for torch.stack below
from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling
from unsloth import is_bfloat16_supported

print(processed_train_ds[0])

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

def orpheus_data_collator(features):
    batch = {
        "input_ids": torch.stack([f["input_ids"] for f in features]),
        "attention_mask": torch.stack([f["attention_mask"] for f in features]),
        "labels": torch.stack([f["labels"] for f in features]),
    }
    return batch

trainer = Trainer(
    model = model,
    train_dataset = processed_train_ds,
    data_collator = data_collator,
    #eval_dataset = processed_eval_ds,
    args = TrainingArguments(
        per_device_train_batch_size = 8,  # 16 is fine for Orpheus at 1024 seq length on 40 GB VRAM; 8 for 2048
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        # max_steps = 60,
        num_train_epochs = 3,
        #eval_strategy = "steps",
        #eval_steps = 0.2,
        learning_rate = 2e-4,  # try 2e-5 for FFT, 2e-4 for LoRA
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,  # turn this on if overfitting
        lr_scheduler_type = "constant",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",  # use this for WandB or TensorBoard
        logging_dir = f"logs/{run_name}",
    ),
)

x = processed_train_ds[0]
print(x["labels"].unique())
print(x["labels"].dtype)

trainer_stats = trainer.train()
```
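
One thing I should mention: `orpheus_data_collator` is defined above but never actually used; the Trainer is handed `data_collator` (the stock `DataCollatorForLanguageModeling`). I don't know whether that's related to the error, but swapping it in would look like this (a sketch; `training_args` stands in for the same `TrainingArguments` block shown above):

```
trainer = Trainer(
    model = model,
    train_dataset = processed_train_ds,
    data_collator = orpheus_data_collator,  # custom collator instead of the stock one
    args = training_args,  # placeholder for the TrainingArguments block above
)
```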

The error is:

```
tensor([  -100,     11,     13,  ..., 156916, 156924, 156927])
torch.int64
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 7,064 | Num Epochs = 3 | Total steps = 333
O^O/ \_/ \    Batch size per device = 16 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (16 x 4 x 1) = 64
 "-____-"     Trainable parameters = 97,255,424 of 3,398,122,496 (2.86% trained)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[58], line 4
      2 print(x["labels"].unique())
      3 print(x["labels"].dtype)
----> 4 trainer_stats = trainer.train()

File ~/anaconda3/envs/py312/lib/python3.12/site-packages/transformers/trainer.py:2325, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   2323     hf_hub_utils.enable_progress_bars()
   2324 else:
-> 2325     return inner_training_loop(
   2326         args=args,
   2327         resume_from_checkpoint=resume_from_checkpoint,
   2328         trial=trial,
   2329         ignore_keys_for_eval=ignore_keys_for_eval,
   2330     )

File <string>:330, in _fast_inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)

File <string>:71, in _unsloth_training_step(***failed resolving arguments***)

AttributeError: 'int' object has no attribute 'mean'
```
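
To try to narrow this down myself, I also ran a single batch through the collator and the model by hand, bypassing the Trainer (a debugging sketch, assuming `model`, `data_collator`, and `processed_train_ds` from above are in scope):

```
import torch

# Build one batch by hand, bypassing the Trainer entirely
features = [processed_train_ds[i] for i in range(2)]
batch = data_collator(features)
batch = {k: v.to(model.device) for k, v in batch.items()}

with torch.no_grad():
    out = model(**batch)

# If the training step calls .mean() on this, it must be a tensor;
# a plain Python int here would reproduce the AttributeError
print(type(out.loss), out.loss)
```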