Each cell reports the score for short prompts followed by the score for long prompts (short / long).

| Model | Overall | Basic Following: Avg | Basic Following: Attribute | Basic Following: Relation | Basic Following: Reasoning | Advanced Following: Avg | Advanced Following: Attribute+Relation | Advanced Following: Attribute+Reasoning | Advanced Following: Relation+Reasoning | Advanced Following: Style | Advanced Following: Text | Designer: Real World |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Diffusion based Models (GPT-4o Evaluation) | | | | | | | | | | | | |
| FLUX.1 dev | 71.17 / 71.78 | 83.23 / 78.65 | 87.17 / 83.17 | 87.39 / 80.39 | 75.14 / 72.39 | 65.79 / 68.54 | 67.07 / 73.69 | 73.84 / 73.34 | 69.09 / 71.59 | 66.67 / 66.67 | 43.83 / 52.83 | 70.72 / 71.47 |
| SD XL | 54.96 / 42.13 | 65.72 / 53.28 | 59.33 / 50.83 | 77.57 / 62.57 | 60.32 / 46.57 | 49.73 / 36.22 | 47.82 / 35.57 | 56.22 / 45.34 | 52.59 / 36.09 | 73.33 / 60.00 | 16.83 / 0.83 | 50.92 / 41.59 |
| SD 3 | 67.46 / 66.09 | 78.32 / 77.75 | 83.33 / 79.83 | 82.07 / 78.82 | 71.07 / 74.07 | 61.46 / 59.56 | 61.07 / 64.07 | 68.84 / 70.34 | 50.96 / 57.84 | 66.67 / 76.67 | 59.83 / 20.83 | 63.23 / 67.34 |
| SD 3.5 | 71.15 / 66.96 | 78.34 / 79.56 | 79.50 / 76.50 | 80.96 / 83.21 | 72.46 / 78.71 | 67.67 / 61.18 | 66.46 / 61.89 | 73.53 / 74.15 | 60.03 / 61.53 | 73.33 / 63.33 | 70.52 / 42.52 | 64.43 / 66.39 |
| SANA Sprint | 63.68 / 58.50 | 76.58 / 71.00 | 75.33 / 71.33 | 81.82 / 72.07 | 72.57 / 69.57 | 57.67 / 51.80 | 55.32 / 54.94 | 68.46 / 66.72 | 62.59 / 63.46 | 80.00 / 60.00 | 8.83 / 5.83 | 66.96 / 58.01 |
| SANA 1.5 | 67.15 / 65.73 | 79.66 / 77.08 | 79.83 / 77.83 | 85.57 / 83.57 | 73.57 / 69.82 | 61.50 / 60.67 | 65.32 / 56.57 | 69.96 / 73.09 | 62.96 / 65.84 | 80.00 / 80.00 | 17.83 / 15.83 | 71.07 / 68.83 |
| Playground v2 | 45.64 / 52.78 | 59.83 / 69.58 | 51.33 / 66.33 | 70.57 / 76.07 | 57.57 / 66.32 | 38.43 / 44.75 | 41.57 / 45.57 | 48.96 / 59.97 | 41.72 / 53.84 | 53.33 / 60.00 | 0.00 / 0.83 | 45.32 / 46.44 |
| Playground v2.5 | 47.73 / 54.82 | 63.08 / 68.08 | 57.83 / 73.83 | 71.82 / 77.32 | 59.57 / 53.07 | 40.73 / 48.17 | 39.70 / 45.82 | 49.59 / 64.22 | 44.22 / 46.72 | 60.00 / 80.00 | 0.00 / 4.83 | 47.19 / 47.56 |
| PixArt-delta | 41.01 / 48.24 | 53.83 / 59.25 | 46.33 / 52.83 | 62.07 / 71.32 | 53.07 / 53.57 | 34.60 / 42.77 | 32.44 / 37.44 | 53.59 / 56.59 | 36.96 / 49.46 | 46.67 / 73.33 | 0.00 / 0.00 | 38.23 / 40.10 |
| PixArt-alpha | 44.37 / 50.50 | 55.50 / 61.00 | 52.33 / 56.33 | 63.82 / 74.07 | 50.32 / 52.57 | 38.71 / 44.90 | 37.82 / 41.32 | 58.84 / 52.46 | 40.22 / 47.09 | 50.00 / 76.67 | 0.00 / 0.83 | 45.70 / 53.16 |
| PixArt-sigma | 62.00 / 58.12 | 70.66 / 75.25 | 69.33 / 78.83 | 75.07 / 77.32 | 67.57 / 69.57 | 57.65 / 49.50 | 65.20 / 56.57 | 66.96 / 61.72 | 66.59 / 54.59 | 83.33 / 70.00 | 1.83 / 1.83 | 62.11 / 52.41 |
| LUMINA-Next | 50.93 / 52.46 | 64.58 / 66.08 | 56.83 / 59.33 | 67.57 / 71.82 | 69.32 / 67.07 | 44.75 / 45.63 | 51.44 / 43.20 | 51.09 / 59.72 | 44.72 / 54.46 | 70.00 / 66.67 | 0.00 / 0.83 | 47.56 / 49.05 |
| Hunyuan-DiT | 51.38 / 53.28 | 69.33 / 69.00 | 65.83 / 69.83 | 78.07 / 73.82 | 64.07 / 63.32 | 42.62 / 45.45 | 50.20 / 41.57 | 59.22 / 61.84 | 47.84 / 51.09 | 56.67 / 73.33 | 0.00 / 0.83 | 40.10 / 44.20 |
| AR based Models (GPT-4o Evaluation) | | | | | | | | | | | | |
| Llamagen | 41.67 / 38.22 | 53.00 / 50.00 | 48.33 / 42.33 | 59.57 / 60.32 | 51.07 / 47.32 | 35.89 / 32.61 | 38.82 / 31.57 | 40.84 / 47.22 | 49.59 / 46.22 | 46.67 / 33.33 | 0.00 / 0.00 | 39.73 / 35.62 |
| LightGen | 53.22 / 43.41 | 66.58 / 47.91 | 55.83 / 47.33 | 74.82 / 45.82 | 69.07 / 50.57 | 46.74 / 41.53 | 62.44 / 40.82 | 61.71 / 50.47 | 50.34 / 45.34 | 53.33 / 53.33 | 0.00 / 6.83 | 50.92 / 50.55 |
| Show-o | 59.72 / 58.86 | 73.08 / 75.83 | 74.83 / 79.83 | 78.82 / 78.32 | 65.57 / 69.32 | 53.67 / 50.38 | 60.95 / 56.82 | 68.59 / 68.96 | 66.46 / 56.22 | 63.33 / 66.67 | 3.83 / 2.83 | 55.02 / 50.92 |
| Infinity | 62.07 / 62.32 | 73.08 / 75.41 | 74.33 / 76.83 | 72.82 / 77.57 | 72.07 / 71.82 | 56.64 / 54.98 | 60.44 / 55.57 | 74.22 / 64.71 | 60.22 / 59.71 | 80.00 / 73.33 | 10.83 / 23.83 | 54.28 / 56.89 |
| JanusPro | 66.50 / 65.02 | 79.33 / 78.25 | 79.33 / 82.33 | 78.32 / 73.32 | 80.32 / 79.07 | 59.71 / 58.82 | 66.07 / 56.20 | 70.46 / 70.84 | 67.22 / 59.97 | 60.00 / 70.00 | 28.83 / 33.83 | 65.84 / 60.25 |
| Closed-Source Models (GPT-4o Evaluation) | | | | | | | | | | | | |
| DALL-E 3 | 74.96 / 70.81 | 78.72 / 78.50 | 79.50 / 79.83 | 80.82 / 78.82 | 75.82 / 76.82 | 73.39 / 67.27 | 73.45 / 67.20 | 72.01 / 71.34 | 63.59 / 60.72 | 89.66 / 86.67 | 66.83 / 54.83 | 72.93 / 60.99 |
| MidJourney v6 | 70.78 / 67.70 | 76.00 / 69.08 | 77.83 / 69.33 | 81.32 / 73.07 | 68.82 / 64.82 | 68.54 / 67.62 | 57.82 / 61.95 | 69.84 / 63.96 | 57.46 / 60.34 | 83.33 / 73.33 | 75.83 / 73.83 | 65.10 / 68.46 |
| MidJourney v7 | 68.74 / 65.69 | 77.41 / 76.00 | 77.58 / 81.83 | 82.07 / 76.82 | 72.57 / 69.32 | 64.66 / 60.53 | 67.20 / 62.70 | 81.22 / 71.59 | 60.72 / 64.59 | 83.33 / 80.00 | 24.83 / 20.83 | 68.83 / 63.61 |
| FLUX.1 Pro | 67.32 / 69.89 | 79.08 / 78.91 | 78.83 / 81.33 | 82.82 / 83.82 | 75.57 / 71.57 | 61.10 / 65.37 | 62.32 / 65.57 | 69.84 / 71.47 | 65.96 / 67.72 | 63.00 / 63.00 | 35.83 / 55.83 | 71.80 / 68.80 |
| GPT-4o | 89.15 / 88.29 | 90.75 / 89.66 | 91.33 / 87.08 | 84.57 / 84.57 | 96.32 / 97.32 | 88.55 / 88.35 | 87.07 / 89.44 | 87.22 / 83.96 | 85.59 / 83.21 | 90.00 / 93.33 | 89.83 / 86.83 | 89.73 / 93.46 |