Text-to-Audio Models Robustness Leaderboard

| System | Uppercase | Synonym | Misspelling | Whitespace | Rewrite | Punctuation |
|---|---|---|---|---|---|---|
| AudioGen | 0.992 | 0.978 | 0.985 | 0.978 | 1.000 | 1.074 |
| AudioLDM | 1.032 | 0.955 | 0.962 | 0.955 | 1.041 | 0.992 |
| AudioLDM 2 | 1.076 | 1.060 | 1.007 | 0.986 | 1.014 | 1.060 |
| Auffusion | 1.078 | 0.979 | 0.939 | 1.000 | 1.053 | 0.979 |
| MAGNeT | 0.951 | 0.815 | 0.843 | 0.882 | 0.829 | 0.815 |
| Make-An-Audio | 0.979 | 0.986 | 0.965 | 0.993 | 0.958 | 0.972 |
| Make-An-Audio-2 | 1.156 | 1.146 | 1.175 | 1.146 | 1.175 | 1.175 |
| Stable Audio Open | 1.172 | 1.124 | 1.101 | 1.069 | 1.028 | 1.079 |
| Tango | 1.140 | 1.078 | 1.022 | 1.000 | 1.030 | 1.045 |
| Tango 2 | 1.214 | 1.084 | 1.022 | 0.986 | 1.007 | 1.022 |