Text-to-Audio Models Information

Model              Basic Info             Model Configuration   Training Data
                   Org.          License  Var.   Params  Arch.  Source            Dur.
AudioGen           Meta          CBN4     med    1.5B    AR     AS, AC + 8 oth.   6824
AudioLDM           Surrey        CBNS4    full   739M    LDM    AS, AC + 2 oth.   9031
AudioLDM 2         Surrey        CBNS4    large  712M    LDM    AC, AS + 3 oth.   29510
Auffusion          BUPT          CBNS4    full   1.1B    LDM    AC, AS + 9 oth.   1990
MAGNeT             Meta          CBN4     med    1.5B    NAR    licensed data     16000
Make-An-Audio      ZJU           MIT             453M    LDM    AS, AC + 13 oth.  ~3k
Make-An-Audio 2    ZJU           MIT             937M    LDM    AS, AC + 10 oth.  3700
Stable-Audio Open  Stability AI  Comm.    1.0    1057M   DiT    FS, FMA           7300
Tango              DeClaRe       CBNS4    full   866M    LDM    AS, AC + 7 oth.   1.2M
Tango 2            DeClaRe       CBNS4    full   866M    LDM    AL                -