Text-to-Audio Models Information
Column groups: Basic Info (Org., License); Model Configuration (Var., Params, Arch.); Training Data (Source, Dur.).

| Model | Org. | License | Var. | Params | Arch. | Source | Dur. |
|---|---|---|---|---|---|---|---|
| AudioGen | Meta | CBN4 | med | 1.5B | AR | AS, AC + 8 oth. | 6824 |
| AudioLDM | Surrey | CBNS4 | full | 739M | LDM | AS, AC + 2 oth. | 9031 |
| AudioLDM 2 | Surrey | CBNS4 | large | 712M | LDM | AC, AS + 3 oth. | 29510 |
| Auffusion | BUPT | CBNS4 | full | 1.1B | LDM | AC, AS + 9 oth. | 1990 |
| MAGNeT | Meta | CBN4 | med | 1.5B | NAR | licensed data | 16000 |
| Make-An-Audio | ZJU | MIT | - | 453M | LDM | AS, AC + 13 oth. | ~3k |
| Make-An-Audio 2 | ZJU | MIT | - | 937M | LDM | AS, AC + 10 oth. | 3700 |
| Stable-Audio Open | Stability AI | Comm. | 1.0 | 1057M | DiT | FS, FMA | 7300 |
| Tango | DeClaRe | CBNS4 | full | 866M | LDM | AS, AC + 7 oth. | 1.2M |
| Tango 2 | DeClaRe | CBNS4 | full | 866M | LDM | AL | - |