A 7B VLA model running on a commercial-grade GPU like the RTX 4090 generally achieves an inference frequency of approximately 5-12 Hz.
Training Parameters
Data Collator
Core functionality:
- Batch creation.
- Padding. Ensuring all samples in a batch (such as variable-length text inputs) have the same dimensions.
- Convert Python objects to PyTorch.
- Create attention masks or add special tokens.