[NeurIPS 2025] Flow x RL. "ReinFlow: Fine-tuning Flow Policy with Online Reinforcement Learning". Supports VLAs, e.g., pi0 and pi0.5. Fully open-sourced.
Updated Mar 21, 2026 - Python
Building a simple VLM: implementing LlaMA-SmolLM2 from scratch plus a SigLip2 vision model. KV caching is supported and also implemented from scratch.
A Python script to analyze images generated using a LoRA (Low-Rank Adaptation) model applied at various strength levels. This tool helps determine an optimal strength for a given LoRA by evaluating image quality and similarity to control images.
Fine-tuned the 3B-parameter PaliGemma2 vision model for Valorant object detection, improving IoU scores across all classes. Developed for research experimentation.
Multimodal Medical AI Fine-Tuned on Qwen-2.5-VL-7B with LoRA + Medical Distillation
Building models from scratch and tuning pre-trained models to recognise different house cats
Fine-tuning LiquidAI/LFM2-VL-1.6B in Colab (LoRA/4-bit) + dataset template + probe test.
Fine-tuning the DINO object detection model on a COCO-annotated pedestrian dataset from IIT Delhi. Includes data prep, training, evaluation, and visualization scripts.
A YOLO11 model fine-tuned for up to 100 epochs. The yolo11s model, fine-tuned on a custom dataset, targets the downstream task of traffic signal detection in both images and videos. The model has also been exported to the ONNX format; you may export it to your desired serialization format.
This repository contains Multi-Tag (also called Multi-Task or Multi-Output) image classification on the Fashion Products Images dataset on Kaggle using EfficientNetB0, with high accuracy.
PyTorch-native fine-tuning of Google's multimodal instruction-tuned model (Gemma 3).
A toolkit for training and fine-tuning diffusion model LoRAs.