Set up conda environment for speech-to-text fine-tuning
Sets up a conda environment for speech-to-text fine-tuning with Whisper and Hugging Face.
/plugin marketplace add danielrosehill/linux-desktop-plugin
/plugin install lan-manager@danielrosehill
You are helping the user set up a conda environment for speech-to-text (STT) fine-tuning.
Create base environment
conda create -n stt-finetune python=3.11 -y
conda activate stt-finetune
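Before installing anything, it can help to confirm that the new environment is really the active one. The following is a minimal sketch, assuming conda exports the `CONDA_DEFAULT_ENV` variable on activation; the helper name `active_conda_env` is hypothetical and only used for illustration.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is set."""
    # Assumption: `conda activate` sets CONDA_DEFAULT_ENV in the shell environment.
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    name = active_conda_env()
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Active conda env: {name}")
    print(f"Python version: {version}")
    if name != "stt-finetune":
        print("Warning: expected the 'stt-finetune' environment to be active.")
```

Run this with the interpreter from the activated shell; if the printed name is not `stt-finetune`, the later `pip install` commands would land in a different environment.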
Install PyTorch with ROCm
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
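To check which backend the installed wheel was built for, something like the sketch below may be useful. It relies on `torch.version.hip` and `torch.version.cuda`, which are set to a version string on ROCm and CUDA builds respectively and are `None` otherwise; the function name `describe_torch_backend` is a hypothetical helper, not part of PyTorch.

```python
import importlib.util


def describe_torch_backend():
    """Return a short description of the installed PyTorch build."""
    # Avoid an ImportError if PyTorch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"

    import torch

    hip = getattr(torch.version, "hip", None)    # version string on ROCm builds
    cuda = getattr(torch.version, "cuda", None)  # version string on CUDA builds
    if hip:
        return f"ROCm/HIP build {hip}"
    if cuda:
        return f"CUDA build {cuda}"
    return "CPU-only build"


if __name__ == "__main__":
    print(describe_torch_backend())
```

On a ROCm wheel, `torch.cuda.is_available()` is still the call that reports whether the GPU is usable, so a "ROCm/HIP build" result here only confirms the wheel type, not that the device was detected.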
Install Whisper and related libraries
pip install openai-whisper
pip install faster-whisper # Optimized inference
pip install whisperx # Advanced features
Install Hugging Face libraries
pip install transformers
pip install datasets
pip install accelerate
pip install evaluate
pip install peft # For LoRA fine-tuning
Install audio processing libraries
pip install librosa # Audio analysis
pip install soundfile # Audio I/O
pip install pydub # Audio manipulation
pip install sox # Audio processing
conda install -c conda-forge ffmpeg -y # Audio conversion
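Some of these Python packages are wrappers around command-line tools: `pydub` calls `ffmpeg` for most formats, and the `sox` package drives the SoX binary. A small sketch for checking that those executables are on `PATH` follows; the helper name `find_audio_tools` and the default tool list are assumptions made for illustration.

```python
import shutil


def find_audio_tools(names=("ffmpeg", "sox")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_audio_tools().items():
        status = path if path else "NOT FOUND"
        print(f"{tool}: {status}")
```

If `sox` is reported as missing, installing the binary separately (for example through the system package manager or conda-forge) is likely needed in addition to the pip package.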
Install speech-specific tools
pip install jiwer # Word Error Rate calculation
pip install speechbrain # Speech toolkit
pip install pyannote.audio # Speaker diarization
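For reference, Word Error Rate is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The pure-Python sketch below illustrates that definition without depending on `jiwer`; it is not `jiwer`'s implementation, applies no text normalization, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Compute WER as word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {score:.3f}")
```

In the usage example, one substitution ("sat" to "sit") and one deletion ("the") against six reference words give a WER of 2/6. For real evaluation, a library such as `jiwer` is preferable because it also handles normalization and batching.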
Install data processing tools
pip install pandas
pip install numpy
pip install scipy
pip install matplotlib
pip install seaborn # Visualization
Install monitoring and experimentation
pip install wandb # Experiment tracking
pip install tensorboard
Install Jupyter for interactive work
conda install -c conda-forge jupyter jupyterlab ipywidgets -y
Test installation
import torch
import whisper
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration
print(f"PyTorch: {torch.__version__}")
print(f"GPU available: {torch.cuda.is_available()}")
print("All libraries imported successfully!")
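The test above stops at the first failed import. A slightly more forgiving sketch is shown below: it reports every missing module in one pass using `importlib.util.find_spec`, without importing the heavy packages themselves. The helper name `missing_modules` is hypothetical, and the module list is an assumption based on the usual import names of the packages installed above (for example, `openai-whisper` imports as `whisper`).

```python
import importlib.util

# Assumed import names for the packages installed in the earlier steps.
EXPECTED_MODULES = [
    "torch", "whisper", "faster_whisper", "whisperx",
    "transformers", "datasets", "accelerate", "evaluate", "peft",
    "librosa", "soundfile", "pydub", "sox",
    "jiwer", "speechbrain", "pyannote.audio",
    "pandas", "numpy", "scipy", "matplotlib", "seaborn",
    "wandb", "tensorboard",
]


def missing_modules(names):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Note that `find_spec` only confirms a module can be located; a package with a broken native dependency could still fail at import time, so the original import test remains a useful final check.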
~/scripts/whisper-finetune-example.py with basic setup
Provide a summary showing: