testperf

testperf:  run all performance tests

------------------installing ssget:

------------------installing GraphBLAS/Demo/MATLAB:

------------------installing spok:

------------------installing SSMULT:

------------------installing CXSparse:

------------------installing CHOLMOD:

performance of GxB_select

Problem: m 200 n 1 nnz 1
tril:
k:       -200 GB:   0.000027    0.000011 MATLAB:   0.000353 nnz:          0 speedup 12.94  33.54 
k:       -100 GB:   0.000009    0.000007 MATLAB:   0.000025 nnz:          1 speedup  2.78   3.48 
k:        -50 GB:   0.000007    0.000007 MATLAB:   0.000012 nnz:          1 speedup  1.59   1.63 
k:         -1 GB:   0.000007    0.000007 MATLAB:   0.000004 nnz:          1 speedup  0.58   0.62 
k:          0 GB:   0.000007    0.000007 MATLAB:   0.000004 nnz:          1 speedup  0.61   0.63 
k:          1 GB:   0.000007    0.000007 MATLAB:   0.000014 nnz:          1 speedup  2.00   2.07 
k:         50 GB:   0.000007    0.000015 MATLAB:   0.000069 nnz:          1 speedup  9.56   4.67 
k:          0 GB:   0.000024    0.000022 MATLAB:   0.000037 nnz:          1 speedup  1.50   1.65 
k:          1 GB:   0.000021    0.000021 MATLAB:   0.000011 nnz:          1 speedup  0.52   0.52 
triu:
k:       -200 GB:   0.000019    0.000014 MATLAB:   0.000040 nnz:          1 speedup  2.06   2.84 
k:       -100 GB:   0.000009    0.000008 MATLAB:   0.000012 nnz:          0 speedup  1.34   1.52 
k:        -50 GB:   0.000007    0.000007 MATLAB:   0.000004 nnz:          0 speedup  0.50   0.50 
k:         -1 GB:   0.000007    0.000007 MATLAB:   0.000003 nnz:          0 speedup  0.47   0.48 
k:          0 GB:   0.000007    0.000007 MATLAB:   0.000003 nnz:          0 speedup  0.48   0.49 
k:          1 GB:   0.000007    0.000007 MATLAB:   0.000013 nnz:          0 speedup  1.72   1.76 
k:         50 GB:   0.000007    0.000008 MATLAB:   0.000014 nnz:          0 speedup  1.93   1.76 
k:          0 GB:   0.000009    0.000008 MATLAB:   0.000005 nnz:          0 speedup  0.52   0.57 
k:          1 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:          0 speedup  0.50   0.55 
diag:
k:       -200 GB:   0.000010    0.000010 
k:       -100 GB:   0.000009    0.000008 
k:        -50 GB:   0.000011    0.000010 
k:         -1 GB:   0.000010    0.000010 
k:          0 GB:   0.000011    0.000010 
k:          1 GB:   0.000012    0.000010 
k:         50 GB:   0.000012    0.000011 
k:          0 GB:   0.000011    0.000010 
k:          1 GB:   0.000011    0.000010 
offdiag:
k:       -200 GB:   0.000011    0.000014 
k:       -100 GB:   0.000008    0.000008 
k:        -50 GB:   0.000008    0.000007 
k:         -1 GB:   0.000007    0.000007 
k:          0 GB:   0.000007    0.000007 
k:          1 GB:   0.000008    0.000007 
k:         50 GB:   0.000008    0.000007 
k:          0 GB:   0.000008    0.000007 
k:          1 GB:   0.000008    0.000007 
nonzero:
k:       -200 GB:   0.000008    0.000010 MATLAB:   0.000153 nnz:          1 speedup 19.27  16.03 
k:       -100 GB:   0.000008    0.000007 MATLAB:   0.000020 nnz:          1 speedup  2.63   2.79 
k:        -50 GB:   0.000007    0.000011 MATLAB:   0.000043 nnz:          1 speedup  5.77   3.78 
k:         -1 GB:   0.000010    0.000010 MATLAB:   0.000073 nnz:          1 speedup  7.56   7.56 
k:          0 GB:   0.000012    0.000007 MATLAB:   0.000020 nnz:          1 speedup  1.65   2.71 
k:          1 GB:   0.000011    0.000007 MATLAB:   0.000013 nnz:          1 speedup  1.10   1.71 
k:         50 GB:   0.000020    0.000007 MATLAB:   0.000012 nnz:          1 speedup  0.57   1.65 
k:          0 GB:   0.000031    0.000007 MATLAB:   0.000024 nnz:          1 speedup  0.77   3.47 
k:          1 GB:   0.000020    0.000007 MATLAB:   0.000027 nnz:          1 speedup  1.36   3.82 

Problem: m 200 n 100 nnz 1
tril:
k:       -200 GB:   0.000008    0.000010 MATLAB:   0.000008 nnz:          0 speedup  0.89   0.78 
k:       -100 GB:   0.000008    0.000008 MATLAB:   0.000006 nnz:          1 speedup  0.73   0.75 
k:        -50 GB:   0.000008    0.000008 MATLAB:   0.000006 nnz:          1 speedup  0.70   0.76 
k:         -1 GB:   0.000012    0.000009 MATLAB:   0.000006 nnz:          1 speedup  0.49   0.71 
k:          0 GB:   0.000012    0.000008 MATLAB:   0.000006 nnz:          1 speedup  0.50   0.69 
k:          1 GB:   0.000022    0.000009 MATLAB:   0.000007 nnz:          1 speedup  0.31   0.77 
k:         50 GB:   0.000021    0.000021 MATLAB:   0.000005 nnz:          1 speedup  0.21   0.22 
k:         50 GB:   0.000009    0.000009 MATLAB:   0.000006 nnz:          1 speedup  0.74   0.74 
k:        100 GB:   0.000017    0.000009 MATLAB:   0.000006 nnz:          1 speedup  0.35   0.70 
triu:
k:       -200 GB:   0.000009    0.000010 MATLAB:   0.000007 nnz:          1 speedup  0.81   0.71 
k:       -100 GB:   0.000012    0.000008 MATLAB:   0.000007 nnz:          1 speedup  0.54   0.83 
k:        -50 GB:   0.000017    0.000008 MATLAB:   0.000007 nnz:          0 speedup  0.38   0.81 
k:         -1 GB:   0.000010    0.000008 MATLAB:   0.000007 nnz:          0 speedup  0.65   0.82 
k:          0 GB:   0.000009    0.000008 MATLAB:   0.000006 nnz:          0 speedup  0.72   0.79 
k:          1 GB:   0.000021    0.000009 MATLAB:   0.000007 nnz:          0 speedup  0.32   0.78 
k:         50 GB:   0.000009    0.000008 MATLAB:   0.000007 nnz:          0 speedup  0.77   0.83 
k:         50 GB:   0.000008    0.000008 MATLAB:   0.000007 nnz:          0 speedup  0.84   0.84 
k:        100 GB:   0.000021    0.000008 MATLAB:   0.000007 nnz:          0 speedup  0.32   0.81 
diag:
k:       -200 GB:   0.000011    0.000010 MATLAB:   0.000917 nnz:          0 speedup 83.48  90.84 
k:       -100 GB:   0.000022    0.000008 MATLAB:   0.000763 nnz:          1 speedup 35.34  94.36 
k:        -50 GB:   0.000016    0.000009 MATLAB:   0.000174 nnz:          0 speedup 10.58  18.39 
k:         -1 GB:   0.000040    0.000022 MATLAB:   0.000101 nnz:          0 speedup  2.52   4.65 
k:          0 GB:   0.000010    0.000014 MATLAB:   0.000072 nnz:          0 speedup  7.56   5.32 
k:          1 GB:   0.000010    0.000014 MATLAB:   0.000094 nnz:          0 speedup  9.55   6.55 
k:         50 GB:   0.000014    0.000012 MATLAB:   0.000099 nnz:          0 speedup  7.10   8.13 
k:         50 GB:   0.000080    0.000027 MATLAB:   0.000179 nnz:          0 speedup  2.25   6.68 
k:        100 GB:   0.000037    0.000014 MATLAB:   0.000573 nnz:          0 speedup 15.36  39.56 
offdiag:
k:       -200 GB:   0.000048    0.000017 MATLAB:   0.000304 nnz:          1 speedup  6.34  18.36 
k:       -100 GB:   0.000067    0.000019 MATLAB:   0.000256 nnz:          0 speedup  3.84  13.37 
k:        -50 GB:   0.000018    0.000020 MATLAB:   0.000279 nnz:          1 speedup 15.55  13.79 
k:         -1 GB:   0.000015    0.000028 MATLAB:   0.000360 nnz:          1 speedup 23.68  12.98 
k:          0 GB:   0.000021    0.000017 MATLAB:   0.000174 nnz:          1 speedup  8.15  10.14 
k:          1 GB:   0.000011    0.000009 MATLAB:   0.000078 nnz:          1 speedup  7.12   8.81 
k:         50 GB:   0.000009    0.000008 MATLAB:   0.000062 nnz:          1 speedup  7.19   7.71 
k:         50 GB:   0.000013    0.000009 MATLAB:   0.000070 nnz:          1 speedup  5.38   7.51 
k:        100 GB:   0.000010    0.000011 MATLAB:   0.000127 nnz:          1 speedup 12.26  11.43 
nonzero:
k:       -200 GB:   0.000020    0.000031 MATLAB:   0.000381 nnz:          1 speedup 19.36  12.23 
k:       -100 GB:   0.000015    0.000013 MATLAB:   0.000039 nnz:          1 speedup  2.57   3.01 
k:        -50 GB:   0.000013    0.000011 MATLAB:   0.000014 nnz:          1 speedup  1.11   1.27 
k:         -1 GB:   0.000012    0.000011 MATLAB:   0.000013 nnz:          1 speedup  1.08   1.14 
k:          0 GB:   0.000011    0.000012 MATLAB:   0.000014 nnz:          1 speedup  1.22   1.15 
k:          1 GB:   0.000012    0.000009 MATLAB:   0.000214 nnz:          1 speedup 18.59  25.12 
k:         50 GB:   0.000009    0.000008 MATLAB:   0.000009 nnz:          1 speedup  1.04   1.12 
k:         50 GB:   0.000008    0.000008 MATLAB:   0.000009 nnz:          1 speedup  1.05   1.11 
k:        100 GB:   0.000008    0.000008 MATLAB:   0.000008 nnz:          1 speedup  1.02   1.06 

Problem: m 10 n 10 nnz 67
tril:
k:        -10 GB:   0.000009    0.000010 MATLAB:   0.000005 nnz:          0 speedup  0.53   0.52 
k:         -5 GB:   0.000008    0.000008 MATLAB:   0.000004 nnz:          8 speedup  0.51   0.52 
k:        -50 GB:   0.000008    0.000008 MATLAB:   0.000004 nnz:          0 speedup  0.48   0.50 
k:         -1 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:         27 speedup  0.50   0.51 
k:          0 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:         35 speedup  0.55   0.56 
k:          1 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:         42 speedup  0.50   0.52 
k:         50 GB:   0.000008    0.000007 MATLAB:   0.000003 nnz:         67 speedup  0.45   0.48 
k:          5 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:         60 speedup  0.49   0.51 
k:         10 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:         67 speedup  0.47   0.47 
triu:
k:        -10 GB:   0.000008    0.000010 MATLAB:   0.000004 nnz:         67 speedup  0.54   0.40 
k:         -5 GB:   0.000008    0.000008 MATLAB:   0.000004 nnz:         61 speedup  0.54   0.54 
k:        -50 GB:   0.000007    0.000008 MATLAB:   0.000004 nnz:         67 speedup  0.48   0.48 
k:         -1 GB:   0.000007    0.000008 MATLAB:   0.000004 nnz:         46 speedup  0.53   0.52 
k:          0 GB:   0.000008    0.000007 MATLAB:   0.000004 nnz:         40 speedup  0.50   0.51 
k:          1 GB:   0.000007    0.000007 MATLAB:   0.000004 nnz:         32 speedup  0.52   0.52 
k:         50 GB:   0.000008    0.000008 MATLAB:   0.000004 nnz:          0 speedup  0.44   0.45 
k:          5 GB:   0.000007    0.000007 MATLAB:   0.000004 nnz:         12 speedup  0.50   0.50 
k:         10 GB:   0.000009    0.000008 MATLAB:   0.000004 nnz:          0 speedup  0.43   0.45 
diag:
k:        -10 GB:   0.000008    0.000010 MATLAB:   0.000073 nnz:          0 speedup  9.47   7.46 
k:         -5 GB:   0.000015    0.000013 MATLAB:   0.000093 nnz:          2 speedup  6.35   7.06 
k:        -50 GB:   0.000022    0.000009 MATLAB:   0.000073 nnz:          0 speedup  3.28   8.45 
k:         -1 GB:   0.000026    0.000008 MATLAB:   0.000074 nnz:          5 speedup  2.87   9.64 
k:          0 GB:   0.000020    0.000007 MATLAB:   0.000069 nnz:          8 speedup  3.38   9.24 
k:          1 GB:   0.000021    0.000007 MATLAB:   0.000068 nnz:          6 speedup  3.24   9.09 
k:         50 GB:   0.000018    0.000008 MATLAB:   0.000057 nnz:          0 speedup  3.17   7.39 
k:          5 GB:   0.000018    0.000007 MATLAB:   0.000067 nnz:          5 speedup  3.84   9.23 
k:         10 GB:   0.000020    0.000008 MATLAB:   0.000057 nnz:          0 speedup  2.82   7.49 
offdiag:
k:        -10 GB:   0.000011    0.000010 MATLAB:   0.000064 nnz:         67 speedup  5.73   6.41 
k:         -5 GB:   0.000018    0.000008 MATLAB:   0.000072 nnz:         65 speedup  3.94   9.59 
k:        -50 GB:   0.000016    0.000008 MATLAB:   0.000063 nnz:         67 speedup  3.80   8.21 
k:         -1 GB:   0.000011    0.000008 MATLAB:   0.000070 nnz:         62 speedup  6.11   8.96 
k:          0 GB:   0.000011    0.000008 MATLAB:   0.000073 nnz:         59 speedup  6.53   8.57 
k:          1 GB:   0.000009    0.000008 MATLAB:   0.000076 nnz:         61 speedup  8.59   9.64 
k:         50 GB:   0.000021    0.000008 MATLAB:   0.000065 nnz:         67 speedup  3.10   8.61 
k:          5 GB:   0.000018    0.000008 MATLAB:   0.000072 nnz:         62 speedup  4.06   9.38 
k:         10 GB:   0.000016    0.000008 MATLAB:   0.000064 nnz:         67 speedup  3.89   8.45 
nonzero:
k:        -10 GB:   0.000009    0.000010 MATLAB:   0.000018 nnz:         53 speedup  2.00   1.80 
k:         -5 GB:   0.000012    0.000008 MATLAB:   0.000013 nnz:         53 speedup  1.03   1.54 
k:        -50 GB:   0.000020    0.000008 MATLAB:   0.000013 nnz:         53 speedup  0.62   1.57 
k:         -1 GB:   0.000013    0.000009 MATLAB:   0.000011 nnz:         53 speedup  0.84   1.22 
k:          0 GB:   0.000012    0.000008 MATLAB:   0.000014 nnz:         53 speedup  1.15   1.72 
k:          1 GB:   0.000011    0.000008 MATLAB:   0.000013 nnz:         53 speedup  1.16   1.64 
k:         50 GB:   0.000010    0.000008 MATLAB:   0.000013 nnz:         53 speedup  1.36   1.66 
k:          5 GB:   0.000009    0.000008 MATLAB:   0.000012 nnz:         53 speedup  1.33   1.60 
k:         10 GB:   0.000021    0.000008 MATLAB:   0.000012 nnz:         53 speedup  0.59   1.58 

Problem: m 4000 n 4000 nnz 16000000
tril:
k:      -4000 GB:   0.019004    0.000559 MATLAB:   0.136092 nnz:          0 speedup  7.16  243.34 
k:      -2000 GB:   0.032655    0.049088 MATLAB:   0.224150 nnz:    2001000 speedup  6.86   4.57 
k:        -50 GB:   0.168099    0.104853 MATLAB:   0.070356 nnz:    7803225 speedup  0.42   0.67 
k:         -1 GB:   0.126815    0.060912 MATLAB:   0.071143 nnz:    7998000 speedup  0.56   1.17 
k:          0 GB:   0.077032    0.058670 MATLAB:   0.070427 nnz:    8002000 speedup  0.91   1.20 
k:          1 GB:   0.085323    0.059273 MATLAB:   0.073833 nnz:    8005999 speedup  0.87   1.25 
k:         50 GB:   0.075619    0.059865 MATLAB:   0.070715 nnz:    8200725 speedup  0.94   1.18 
k:       2000 GB:   0.218681    0.142578 MATLAB:   0.074498 nnz:   14001000 speedup  0.34   0.52 
k:       4000 GB:   0.151441    0.133232 MATLAB:   0.078823 nnz:   16000000 speedup  0.52   0.59 
triu:
k:      -4000 GB:   0.135011    0.115383 MATLAB:   0.079560 nnz:   16000000 speedup  0.59   0.69 
k:      -2000 GB:   0.120473    0.105143 MATLAB:   0.073081 nnz:   14001000 speedup  0.61   0.70 
k:        -50 GB:   0.079075    0.069440 MATLAB:   0.065683 nnz:    8200725 speedup  0.83   0.95 
k:         -1 GB:   0.074860    0.065773 MATLAB:   0.066377 nnz:    8005999 speedup  0.89   1.01 
k:          0 GB:   0.086344    0.067531 MATLAB:   0.071620 nnz:    8002000 speedup  0.83   1.06 
k:          1 GB:   0.085912    0.064282 MATLAB:   0.068114 nnz:    7998000 speedup  0.79   1.06 
k:         50 GB:   0.074999    0.061799 MATLAB:   0.062728 nnz:    7803225 speedup  0.84   1.02 
k:       2000 GB:   0.031114    0.028643 MATLAB:   0.052624 nnz:    2001000 speedup  1.69   1.84 
k:       4000 GB:   0.045157    0.036792 MATLAB:   0.043805 nnz:          0 speedup  0.97   1.19 
diag:
k:      -4000 GB:   0.000034    0.000021 MATLAB:   0.000108 nnz:          0 speedup  3.17   5.08 
k:      -2000 GB:   0.000098    0.000091 MATLAB:   0.005916 nnz:       2000 speedup 60.52  65.31 
k:        -50 GB:   0.000167    0.000163 MATLAB:   0.024176 nnz:       3950 speedup 145.06  148.17 
k:         -1 GB:   0.000242    0.000169 MATLAB:   0.015357 nnz:       3999 speedup 63.43  90.77 
k:          0 GB:   0.000173    0.000166 MATLAB:   0.011841 nnz:       4000 speedup 68.43  71.27 
k:          1 GB:   0.000256    0.000315 MATLAB:   0.030405 nnz:       3999 speedup 118.64  96.40 
k:         50 GB:   0.000277    0.000166 MATLAB:   0.016765 nnz:       3950 speedup 60.45  101.08 
k:       2000 GB:   0.000094    0.000109 MATLAB:   0.006027 nnz:       2000 speedup 63.80  55.45 
k:       4000 GB:   0.000018    0.000025 MATLAB:   0.000075 nnz:          0 speedup  4.11   3.04 
offdiag:
k:      -4000 GB:   0.147047    0.121726 MATLAB:   0.080275 nnz:   16000000 speedup  0.55   0.66 
k:      -2000 GB:   0.137266    0.118121 MATLAB:   0.086155 nnz:   15998000 speedup  0.63   0.73 
k:        -50 GB:   0.139697    0.116283 MATLAB:   0.091324 nnz:   15996050 speedup  0.65   0.79 
k:         -1 GB:   0.143571    0.114159 MATLAB:   0.088226 nnz:   15996001 speedup  0.61   0.77 
k:          0 GB:   0.137916    0.114658 MATLAB:   0.088427 nnz:   15996000 speedup  0.64   0.77 
k:          1 GB:   0.135017    0.122774 MATLAB:   0.086619 nnz:   15996001 speedup  0.64   0.71 
k:         50 GB:   0.140782    0.121212 MATLAB:   0.089940 nnz:   15996050 speedup  0.64   0.74 
k:       2000 GB:   0.141235    0.116250 MATLAB:   0.085668 nnz:   15998000 speedup  0.61   0.74 
k:       4000 GB:   0.135528    0.120902 MATLAB:   0.075800 nnz:   16000000 speedup  0.56   0.63 
nonzero:
k:      -4000 GB:   0.143567    0.127001 MATLAB:   0.157127 nnz:   16000000 speedup  1.09   1.24 
k:      -2000 GB:   0.136996    0.124006 MATLAB:   0.163632 nnz:   16000000 speedup  1.19   1.32 
k:        -50 GB:   0.144736    0.118679 MATLAB:   0.162722 nnz:   16000000 speedup  1.12   1.37 
k:         -1 GB:   0.138620    0.123428 MATLAB:   0.151803 nnz:   16000000 speedup  1.10   1.23 
k:          0 GB:   0.139867    0.119528 MATLAB:   0.151462 nnz:   16000000 speedup  1.08   1.27 
k:          1 GB:   0.140658    0.121467 MATLAB:   0.156291 nnz:   16000000 speedup  1.11   1.29 
k:         50 GB:   0.155443    0.119958 MATLAB:   0.158941 nnz:   16000000 speedup  1.02   1.32 
k:       2000 GB:   0.137930    0.116017 MATLAB:   0.154221 nnz:   16000000 speedup  1.12   1.33 
k:       4000 GB:   0.141128    0.118777 MATLAB:   0.161842 nnz:   16000000 speedup  1.15   1.36 

Problem: m 2999349 n 2999349 nnz 14313235
tril:
k:   -2999349 GB:   0.091750    0.058960 MATLAB:   0.102913 nnz:          0 speedup  1.12   1.75 
k:   -1499674 GB:   0.059681    0.044972 MATLAB:   0.104177 nnz:     899614 speedup  1.75   2.32 
k:        -50 GB:   0.105190    0.082922 MATLAB:   0.106082 nnz:    3851719 speedup  1.01   1.28 
k:         -1 GB:   0.115125    0.091077 MATLAB:   0.102741 nnz:    5664550 speedup  0.89   1.13 
k:          0 GB:   0.141014    0.112479 MATLAB:   0.104577 nnz:    8663671 speedup  0.74   0.93 
k:          1 GB:   0.142190    0.120073 MATLAB:   0.104486 nnz:    9621636 speedup  0.73   0.87 
k:         50 GB:   0.142795    0.126242 MATLAB:   0.105655 nnz:   10486596 speedup  0.74   0.84 
k:    1499674 GB:   0.166812    0.143041 MATLAB:   0.101897 nnz:   13508993 speedup  0.61   0.71 
k:    2999349 GB:   0.171913    0.147937 MATLAB:   0.103465 nnz:   14313235 speedup  0.60   0.70 
triu:
k:   -2999349 GB:   0.158035    0.134138 MATLAB:   0.105997 nnz:   14313235 speedup  0.67   0.79 
k:   -1499674 GB:   0.149358    0.124609 MATLAB:   0.115654 nnz:   13413623 speedup  0.77   0.93 
k:        -50 GB:   0.135307    0.110524 MATLAB:   0.109566 nnz:   10464670 speedup  0.81   0.99 
k:         -1 GB:   0.128445    0.112594 MATLAB:   0.105056 nnz:    9606642 speedup  0.82   0.93 
k:          0 GB:   0.124149    0.100122 MATLAB:   0.114688 nnz:    8648685 speedup  0.92   1.15 
k:          1 GB:   0.100918    0.096610 MATLAB:   0.119924 nnz:    5649564 speedup  1.19   1.24 
k:         50 GB:   0.114835    0.066980 MATLAB:   0.100269 nnz:    3830017 speedup  0.87   1.50 
k:    1499674 GB:   0.072109    0.041898 MATLAB:   0.098842 nnz:     804244 speedup  1.37   2.36 
k:    2999349 GB:   0.072135    0.056163 MATLAB:   0.078091 nnz:          0 speedup  1.08   1.39 
diag:
k:   -2999349 GB:   0.017990    0.021492 MATLAB:   0.024092 nnz:          0 speedup  1.34   1.12 
k:   -1499674 GB:   0.036857    0.026975 MATLAB:   0.243927 nnz:          2 speedup  6.62   9.04 
k:        -50 GB:   0.051331    0.039178 MATLAB:   0.354145 nnz:       3154 speedup  6.90   9.04 
k:         -1 GB:   0.070177    0.048232 MATLAB:   0.374058 nnz:     957957 speedup  5.33   7.76 
k:          0 GB:   0.072932    0.060413 MATLAB:   0.359694 nnz:    2999121 speedup  4.93   5.95 
k:          1 GB:   0.066078    0.051260 MATLAB:   0.359822 nnz:     957965 speedup  5.45   7.02 
k:         50 GB:   0.058256    0.041649 MATLAB:   0.343443 nnz:       3378 speedup  5.90   8.25 
k:    1499674 GB:   0.043307    0.030416 MATLAB:   0.181268 nnz:          2 speedup  4.19   5.96 
k:    2999349 GB:   0.035211    0.020165 MATLAB:   0.022575 nnz:          0 speedup  0.64   1.12 
offdiag:
k:   -2999349 GB:   0.195592    0.251178 MATLAB:   0.218275 nnz:   14313235 speedup  1.12   0.87 
k:   -1499674 GB:   0.151967    0.133627 MATLAB:   0.315315 nnz:   14313233 speedup  2.07   2.36 
k:        -50 GB:   0.155706    0.128022 MATLAB:   0.609695 nnz:   14310081 speedup  3.92   4.76 
k:         -1 GB:   0.147758    0.123825 MATLAB:   0.550531 nnz:   13355278 speedup  3.73   4.45 
k:          0 GB:   0.150859    0.117252 MATLAB:   0.546293 nnz:   11314114 speedup  3.62   4.66 
k:          1 GB:   0.148934    0.131725 MATLAB:   0.480690 nnz:   13355270 speedup  3.23   3.65 
k:         50 GB:   0.154270    0.134581 MATLAB:   0.436402 nnz:   14309857 speedup  2.83   3.24 
k:    1499674 GB:   0.149646    0.132924 MATLAB:   0.307799 nnz:   14313233 speedup  2.06   2.32 
k:    2999349 GB:   0.153131    0.131432 MATLAB:   0.109092 nnz:   14313235 speedup  0.71   0.83 
nonzero:
k:   -2999349 GB:   0.147979    0.127316 MATLAB:   0.282673 nnz:   14313235 speedup  1.91   2.22 
k:   -1499674 GB:   0.150699    0.133361 MATLAB:   0.289767 nnz:   14313235 speedup  1.92   2.17 
k:        -50 GB:   0.152460    0.140019 MATLAB:   0.280819 nnz:   14313235 speedup  1.84   2.01 
k:         -1 GB:   0.154316    0.150415 MATLAB:   0.314513 nnz:   14313235 speedup  2.04   2.09 
k:          0 GB:   0.157078    0.133085 MATLAB:   0.288664 nnz:   14313235 speedup  1.84   2.17 
k:          1 GB:   0.151281    0.132076 MATLAB:   0.293029 nnz:   14313235 speedup  1.94   2.22 
k:         50 GB:   0.160320    0.131603 MATLAB:   0.279025 nnz:   14313235 speedup  1.74   2.12 
k:    1499674 GB:   0.154909    0.136692 MATLAB:   0.312408 nnz:   14313235 speedup  2.02   2.29 
k:    2999349 GB:   0.153957    0.139686 MATLAB:   0.284084 nnz:   14313235 speedup  1.85   2.03 
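
The two speedup columns in the GxB_select rows above appear to be the MATLAB time divided by each of the two GB times. A minimal sketch, assuming that reading, for recomputing them from a log line (the names `LINE_RE` and `parse_select_line` are hypothetical helpers, not part of the test suite). For the very small timings the printed times are rounded to six decimals, so recomputed ratios will only roughly match the reported ones.

```python
import re

# Matches one "k: ... GB: t1 t2 [MATLAB: tm nnz: N speedup s1 s2]" row.
# The MATLAB/nnz/speedup part is optional because some diag/offdiag rows omit it.
LINE_RE = re.compile(
    r"k:\s*(-?\d+)\s+GB:\s*([\d.]+)\s+([\d.]+)"
    r"(?:\s+MATLAB:\s*([\d.]+)\s+nnz:\s*(\d+)\s+speedup\s+([\d.]+)\s+([\d.]+))?"
)

def parse_select_line(line):
    """Parse one GxB_select timing row; return None if the line is not a row."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    k, gb1, gb2, tm, nnz, s1, s2 = m.groups()
    rec = {"k": int(k), "gb": (float(gb1), float(gb2))}
    if tm is not None:
        tm = float(tm)
        rec["matlab"] = tm
        rec["nnz"] = int(nnz)
        rec["reported"] = (float(s1), float(s2))
        # Assumption: speedup = MATLAB time / GB time, for each GB column.
        rec["recomputed"] = (tm / float(gb1), tm / float(gb2))
    return rec

row = ("k:          0 GB:   0.077032    0.058670 MATLAB:   0.070427 "
       "nnz:    8002000 speedup  0.91   1.20 ")
rec = parse_select_line(row)
print(rec["reported"], [round(x, 2) for x in rec["recomputed"]])
```

Rows without a MATLAB column parse to a record holding only `k` and `gb`, so they can be skipped when comparing ratios.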

test36 --------------------- performance of GB_Matrix_subref
-------------------------- column vector (100000000-by-1):
 V(:,1): nnz (V) = 994979
MATLAB 0.002828 GrB: 0.012000  speedup 0.235706
 V(50e6:80e6,1) explicit list:
MATLAB 0.872803 GrB: 0.071085  speedup 12.2782
 V(50e6:80e6,1) colon:
MATLAB 0.586064 GrB: 0.001051  speedup 557.842
 V(50044316,1):
MATLAB 0.000076 GrB: 0.000255  speedup 0.298539
MATLAB 0.000095 GrB: 0.000135  speedup 0.703824
 V( 100 entries ,1):
MATLAB 0.000195 GrB: 0.000235  speedup 0.82921
 V( 100 entries ,1:4):
MATLAB 0.000483 GrB: 0.000700  speedup 0.690172
many single entries:
MATLAB 0.215190 GrB: 4.768942  speedup 0.0451232

test36: all tests passed
start GraphBLAS:
Elapsed time is 0.534108 seconds.

t =

    0.3590

start MATLAB:

tm =

   12.2573

GraphBLAS speedup over MATLAB: 34.1442

test30: all tests passed
start GraphBLAS:
Elapsed time is 0.412640 seconds.

t =

    0.3278

start MATLAB:

tm =

   12.4170

GraphBLAS speedup over MATLAB: 37.8804

test30b: all tests passed

 ---------------------- quick test of GrB_extractTuples
Elapsed time is 0.194098 seconds.

t =

    0.1715

GraphBLAS speedup over MATLAB: 0.654358

test35: all tests passed

test39 performance tests : GrB_transpose 

Prob = 

  struct with fields:

         A: [72000×72000 double]
      name: 'ND/nd24k'
     title: 'ND problem set, matrix nd24k'
        id: 939
      date: '2003'
    author: 'author unknown'
        ed: 'T. Davis'
      kind: '2D/3D problem'


===============================================================
C = Cin + A'
Elapsed time is 1.215878 seconds.
Elapsed time is 1.317950 seconds.
GraphBLAS time: 1.30995
speedup over MATLAB: 0.928272

===============================================================
GraphBLAS: C = (single) A' compared with C=A' in MATLAB

Cin = 

  struct with fields:

    matrix: [72000×72000 double]
     class: 'single'

Elapsed time is 0.670745 seconds.
Elapsed time is 0.647498 seconds.
GraphBLAS time: 0.512292
speedup over MATLAB: 1.30953

===============================================================
C = A + B
Elapsed time is 0.356208 seconds.
Elapsed time is 0.761434 seconds.
GraphBLAS time: 0.438537
speedup over MATLAB: 0.812549
Elapsed time is 0.412696 seconds.
GraphBLAS time: 0.344739
speedup over MATLAB: 1.03363
Elapsed time is 0.331514 seconds.
GraphBLAS time: 0.330915
speedup over MATLAB: 1.07681
Elapsed time is 0.810290 seconds.
GraphBLAS time: 0.808599
speedup over MATLAB: 0.44068
Elapsed time is 0.702649 seconds.
GraphBLAS time: 0.661565
speedup over MATLAB: 0.538622

===============================================================
C = Cin + A + B
Elapsed time is 1.035151 seconds.
Elapsed time is 0.857903 seconds.
Elapsed time is 0.747208 seconds.
Elapsed time is 0.397756 seconds.
Elapsed time is 0.804760 seconds.
GraphBLAS time: 0.755004
speedup over MATLAB: 1.3713
Elapsed time is 0.797446 seconds.
GraphBLAS time: 0.725707
speedup over MATLAB: 1.18242
Elapsed time is 0.764953 seconds.
GraphBLAS time: 0.696907
speedup over MATLAB: 1.07238
Elapsed time is 0.414225 seconds.
GraphBLAS time: 0.37315
speedup over MATLAB: 1.06646

test39: all tests passed

----------------------- performance tests for GrB_Matrix_build

Prob = 

  struct with fields:

     title: 'U CAVETT PROBLEM WITH 5 COMPONENTS ( CHEM. ENG. FROM WESTERBERG )'
         A: [67×67 double]
      name: 'HB/west0067'
        id: 262
      date: '1983'
    author: 'A. Westerberg'
        ed: 'I. Duff, R. Grimes, J. Lewis'
      kind: 'chemical process simulation problem'

matrix from collection, no sorting:

Prob = 

  struct with fields:

         A: [72000×72000 double]
      name: 'ND/nd24k'
     title: 'ND problem set, matrix nd24k'
        id: 939
      date: '2003'
    author: 'author unknown'
        ed: 'T. Davis'
      kind: '2D/3D problem'

MATLAB:
Elapsed time is 1.265169 seconds.
GrB:
Elapsed time is 0.714781 seconds.
CSparse:
Elapsed time is 1.885907 seconds.
sparse2:
Elapsed time is 1.873499 seconds.

random matrix, with duplicates:
MATLAB:
Elapsed time is 2.967140 seconds.
GrB:
Elapsed time is 2.733130 seconds.
CSparse:
Elapsed time is 2.352294 seconds.
sparse2:
Elapsed time is 2.290093 seconds.

same random matrix, but presorted:
MATLAB:
Elapsed time is 1.480245 seconds.
GrB:
Elapsed time is 0.756398 seconds.
CSparse:
Elapsed time is 2.675076 seconds.
sparse2:
Elapsed time is 2.417413 seconds.

test42: all tests passed

------------------------------ testing GB_mex_Matrix_subref
-------------------------- problem:
-------------------------- case 5, ni large, qsort, no dupl:
MATLAB 1.07552 GrB 1.77434 CSparse 1.56644
-------------------------- case 5, ni large, qsort, no dupl:
Elapsed time is 0.715069 seconds.
Elapsed time is 1.077238 seconds.
-------------------------- contig:
length (I), 36000 min 1 max 36000
Elapsed time is 0.245645 seconds.
Elapsed time is 0.348605 seconds.
-------------------------- case 5, ni large, qsort, with dupl:
Elapsed time is 1.836353 seconds.
Elapsed time is 2.517910 seconds.
double transpose time in MATLAB:
Elapsed time is 4.210779 seconds.
-------------------------- first rows:
   1 rows: C0:       414     72000  MATLAB: 0.055372  speedup   0.86
   2 rows: C0:       824    144000  MATLAB: 0.014480  speedup   0.74
   3 rows: C0:      1237    216000  MATLAB: 0.019744  speedup   0.51
   4 rows: C0:      1669    288000  MATLAB: 0.025477  speedup   0.45
   5 rows: C0:      2081    360000  MATLAB: 0.042826  speedup   1.98
   6 rows: C0:      2510    432000  MATLAB: 0.033416  speedup   1.24
   7 rows: C0:      2918    504000  MATLAB: 0.033433  speedup   1.72
   8 rows: C0:      3321    576000  MATLAB: 0.039468  speedup   2.04
   9 rows: C0:      3739    648000  MATLAB: 0.036637  speedup   1.73
  10 rows: C0:      4168    720000  MATLAB: 0.038911  speedup   1.33
  11 rows: C0:      4592    792000  MATLAB: 0.034118  speedup   1.25
  12 rows: C0:      5018    864000  MATLAB: 0.037528  speedup   1.59
  13 rows: C0:      5480    936000  MATLAB: 0.042784  speedup   1.55
  14 rows: C0:      5939   1008000  MATLAB: 0.045026  speedup   1.65
  15 rows: C0:      6395   1080000  MATLAB: 0.041836  speedup   1.59
  16 rows: C0:      6850   1152000  MATLAB: 0.040860  speedup   1.49
  17 rows: C0:      7309   1224000  MATLAB: 0.041326  speedup   1.54
  18 rows: C0:      7767   1296000  MATLAB: 0.049269  speedup   2.18
  19 rows: C0:      8223   1368000  MATLAB: 0.044908  speedup   2.41
  20 rows: C0:      8672   1440000  MATLAB: 0.052208  speedup   2.65
  21 rows: C0:      9121   1512000  MATLAB: 0.052327  speedup   1.92
  22 rows: C0:      9582   1584000  MATLAB: 0.049434  speedup   1.87
  23 rows: C0:     10033   1656000  MATLAB: 0.056564  speedup   2.98
  24 rows: C0:     10486   1728000  MATLAB: 0.047548  speedup   2.14
  25 rows: C0:     10950   1800000  MATLAB: 0.052277  speedup   1.95
  26 rows: C0:     11400   1872000  MATLAB: 0.061797  speedup   3.18
  27 rows: C0:     11856   1944000  MATLAB: 0.059576  speedup   2.09
  28 rows: C0:     12300   2016000  MATLAB: 0.056189  speedup   2.84
  29 rows: C0:     12734   2088000  MATLAB: 0.050761  speedup   2.34
  30 rows: C0:     13171   2160000  MATLAB: 0.061883  speedup   2.41
  31 rows: C0:     13633   2232000  MATLAB: 0.059294  speedup   3.10
  32 rows: C0:     14087   2304000  MATLAB: 0.054493  speedup   2.07
  33 rows: C0:     14532   2376000  MATLAB: 0.063636  speedup   3.20
  34 rows: C0:     14970   2448000  MATLAB: 0.060000  speedup   2.16
  35 rows: C0:     15401   2520000  MATLAB: 0.069119  speedup   3.54
  36 rows: C0:     15829   2592000  MATLAB: 0.067486  speedup   3.40
  37 rows: C0:     16241   2664000  MATLAB: 0.063122  speedup   2.36
  38 rows: C0:     16657   2736000  MATLAB: 0.070393  speedup   3.57
  39 rows: C0:     17075   2808000  MATLAB: 0.070077  speedup   3.59
  40 rows: C0:     17499   2880000  MATLAB: 0.073831  speedup   3.93
  41 rows: C0:     17924   2952000  MATLAB: 0.073441  speedup   3.33
  42 rows: C0:     18350   3024000  MATLAB: 0.068549  speedup   2.51
  43 rows: C0:     18727   3096000  MATLAB: 0.078859  speedup   3.99
  44 rows: C0:     19110   3168000  MATLAB: 0.069940  speedup   2.65
  45 rows: C0:     19497     19497  MATLAB: 0.084619  speedup   4.27
  46 rows: C0:     19864     19864  MATLAB: 0.071373  speedup   2.67
  47 rows: C0:     20242     20242  MATLAB: 0.081091  speedup   4.26
  48 rows: C0:     20620     20620  MATLAB: 0.069092  speedup   2.37
  49 rows: C0:     20976     20976  MATLAB: 0.093863  speedup   4.81
  50 rows: C0:     21345     21345  MATLAB: 0.082370  speedup   4.21
  51 rows: C0:     21701     21701  MATLAB: 0.078285  speedup   3.97
-------------------------- last row:
C0:       345     72000  MATLAB: 0.006558  speedup   2.44
-------------------------- contig, sorted:
Elapsed time is 0.196236 seconds.
Elapsed time is 0.488724 seconds.
-------------------------- contig lower half, sorted:
length (I), 36000 min 36000 max 71999
Elapsed time is 0.139601 seconds.
Elapsed time is 0.167744 seconds.
-------------------------- one row:
MATLAB: 0.040078  speedup   1.84
MATLAB: 0.015389  speedup   2.83
MATLAB: 0.014828  speedup   1.31
MATLAB: 0.010109  speedup   0.73
MATLAB: 0.008095  speedup   0.41
MATLAB: 0.008285  speedup   0.63
MATLAB: 0.015101  speedup   0.73
MATLAB: 0.009158  speedup   0.64
MATLAB: 0.022807  speedup   1.34
MATLAB: 0.008191  speedup   0.62
MATLAB: 0.010455  speedup   0.33
MATLAB: 0.008200  speedup   0.56
MATLAB: 0.016591  speedup   0.68
MATLAB: 0.008507  speedup   0.51
MATLAB: 0.023646  speedup   1.14
MATLAB: 0.008477  speedup   0.52
MATLAB: 0.022558  speedup   1.05
MATLAB: 0.008954  speedup   0.43
MATLAB: 0.019819  speedup   0.99
MATLAB: 0.009147  speedup   0.30
MATLAB: 0.011485  speedup   0.59
MATLAB: 0.008643  speedup   0.24
MATLAB: 0.016231  speedup   0.82
MATLAB: 0.014225  speedup   0.56
MATLAB: 0.018478  speedup   0.79
MATLAB: 0.010061  speedup   0.32
MATLAB: 0.010840  speedup   0.52
MATLAB: 0.025487  speedup   1.15
MATLAB: 0.008583  speedup   0.28
MATLAB: 0.012277  speedup   0.57
MATLAB: 0.020952  speedup   0.72
MATLAB: 0.013824  speedup   0.48
MATLAB: 0.010970  speedup   0.51
MATLAB: 0.018734  speedup   0.67
MATLAB: 0.016639  speedup   0.55
MATLAB: 0.009461  speedup   0.30
MATLAB: 0.012078  speedup   0.57
MATLAB: 0.009736  speedup   0.27
MATLAB: 0.016790  speedup   0.86
MATLAB: 0.013299  speedup   0.47
MATLAB: 0.016867  speedup   0.68
MATLAB: 0.009505  speedup   0.21
MATLAB: 0.014147  speedup   0.52
MATLAB: 0.020618  speedup   0.75
MATLAB: 0.014782  speedup   0.72
MATLAB: 0.013351  speedup   0.62
MATLAB: 0.008524  speedup   0.25
MATLAB: 0.008448  speedup   0.27
MATLAB: 0.014161  speedup   0.60
MATLAB: 0.024665  speedup   1.17
MATLAB: 0.008467  speedup   0.30
MATLAB: 0.012413  speedup   0.59
MATLAB: 0.008322  speedup   0.22
MATLAB: 0.008371  speedup   0.46
MATLAB: 0.023368  speedup   1.06
MATLAB: 0.008633  speedup   0.41
MATLAB: 0.018706  speedup   0.92
MATLAB: 0.008040  speedup   0.29
MATLAB: 0.013353  speedup   0.69
MATLAB: 0.007938  speedup   0.22
MATLAB: 0.013950  speedup   0.71
MATLAB: 0.018567  speedup   0.94
MATLAB: 0.021781  speedup   1.12
MATLAB: 0.008014  speedup   0.47
MATLAB: 0.022358  speedup   1.15
MATLAB: 0.008082  speedup   0.42
MATLAB: 0.022655  speedup   1.44
MATLAB: 0.008791  speedup   0.47
MATLAB: 0.021817  speedup   1.43
MATLAB: 0.007904  speedup   0.74
MATLAB: 0.022199  speedup   2.26
MATLAB: 0.008953  speedup   1.65
-------------------------- two rows:
MATLAB: 0.013527  speedup   0.56
MATLAB: 0.013944  speedup   2.55
MATLAB: 0.012876  speedup   0.59
MATLAB: 0.015323  speedup   1.40
MATLAB: 0.029222  speedup   2.23
MATLAB: 0.013982  speedup   0.76
MATLAB: 0.021462  speedup   1.52
MATLAB: 0.012654  speedup   0.63
MATLAB: 0.014176  speedup   0.88
MATLAB: 0.021198  speedup   0.87
MATLAB: 0.015167  speedup   0.78
MATLAB: 0.017485  speedup   0.94
MATLAB: 0.030140  speedup   1.34
MATLAB: 0.016589  speedup   0.63
MATLAB: 0.016239  speedup   0.95
MATLAB: 0.032468  speedup   1.21
MATLAB: 0.013505  speedup   0.74
MATLAB: 0.023487  speedup   1.06
MATLAB: 0.020617  speedup   0.69
MATLAB: 0.016096  speedup   0.63
MATLAB: 0.023483  speedup   1.11
MATLAB: 0.021920  speedup   0.86
MATLAB: 0.016424  speedup   0.53
MATLAB: 0.016122  speedup   0.85
MATLAB: 0.034861  speedup   1.71
MATLAB: 0.013486  speedup   0.40
MATLAB: 0.016358  speedup   0.64
MATLAB: 0.023948  speedup   1.06
MATLAB: 0.019670  speedup   0.74
MATLAB: 0.016995  speedup   0.50
MATLAB: 0.015845  speedup   0.76
MATLAB: 0.028345  speedup   0.87
MATLAB: 0.015869  speedup   0.71
MATLAB: 0.023234  speedup   0.68
MATLAB: 0.016263  speedup   0.79
MATLAB: 0.028389  speedup   1.30
MATLAB: 0.013786  speedup   0.41
MATLAB: 0.016752  speedup   0.65
MATLAB: 0.024825  speedup   1.06
MATLAB: 0.021545  speedup   0.76
MATLAB: 0.016423  speedup   0.44
MATLAB: 0.018833  speedup   0.48
MATLAB: 0.017853  speedup   0.82
MATLAB: 0.032131  speedup   1.34
MATLAB: 0.016961  speedup   0.45
MATLAB: 0.016012  speedup   0.42
MATLAB: 0.018063  speedup   0.72
MATLAB: 0.028036  speedup   1.27
MATLAB: 0.021365  speedup   0.69
MATLAB: 0.016318  speedup   0.45
MATLAB: 0.016488  speedup   0.63
MATLAB: 0.016963  speedup   0.80
MATLAB: 0.015148  speedup   0.43
MATLAB: 0.015146  speedup   0.57
MATLAB: 0.019536  speedup   0.90
MATLAB: 0.028531  speedup   0.96
MATLAB: 0.013783  speedup   0.71
MATLAB: 0.018428  speedup   0.88
MATLAB: 0.015955  speedup   0.56
MATLAB: 0.016062  speedup   0.51
MATLAB: 0.016845  speedup   0.94
MATLAB: 0.030623  speedup   1.34
MATLAB: 0.015514  speedup   0.92
MATLAB: 0.019289  speedup   1.07
MATLAB: 0.012251  speedup   0.48
MATLAB: 0.018530  speedup   1.25
MATLAB: 0.025609  speedup   1.60
MATLAB: 0.018671  speedup   1.44
MATLAB: 0.017816  speedup   1.25
MATLAB: 0.019190  speedup   1.76
MATLAB: 0.020254  speedup   1.95
MATLAB: 0.014645  speedup   1.78
-------------------------- three contig rows:
MATLAB: 0.025867  speedup   1.21
MATLAB: 0.027007  speedup   4.91
MATLAB: 0.021011  speedup   2.53
MATLAB: 0.022588  speedup   2.81
MATLAB: 0.027965  speedup   2.06
MATLAB: 0.016302  speedup   1.42
MATLAB: 0.021527  speedup   1.46
MATLAB: 0.014340  speedup   1.02
MATLAB: 0.028004  speedup   1.73
MATLAB: 0.013437  speedup   0.55
MATLAB: 0.018012  speedup   1.00
MATLAB: 0.029980  speedup   1.19
MATLAB: 0.014454  speedup   0.91
MATLAB: 0.024617  speedup   1.27
MATLAB: 0.020279  speedup   0.71
MATLAB: 0.017121  speedup   0.67
MATLAB: 0.027069  speedup   1.45
MATLAB: 0.022346  speedup   0.81
MATLAB: 0.016990  speedup   0.48
MATLAB: 0.019270  speedup   0.58
MATLAB: 0.018154  speedup   0.67
MATLAB: 0.026594  speedup   0.70
MATLAB: 0.017083  speedup   0.55
MATLAB: 0.021078  speedup   1.04
MATLAB: 0.029220  speedup   1.02
MATLAB: 0.015385  speedup   0.70
MATLAB: 0.020725  speedup   1.06
MATLAB: 0.033302  speedup   1.19
MATLAB: 0.015969  speedup   0.74
MATLAB: 0.021368  speedup   0.59
MATLAB: 0.019222  speedup   0.49
MATLAB: 0.019891  speedup   0.58
MATLAB: 0.018596  speedup   0.53
MATLAB: 0.017546  speedup   0.83
MATLAB: 0.030090  speedup   1.29
MATLAB: 0.020188  speedup   0.44
MATLAB: 0.021128  speedup   0.54
MATLAB: 0.020848  speedup   0.62
MATLAB: 0.019005  speedup   0.51
MATLAB: 0.017933  speedup   0.88
MATLAB: 0.034367  speedup   1.67
MATLAB: 0.021516  speedup   0.70
MATLAB: 0.017703  speedup   0.47
MATLAB: 0.018084  speedup   0.56
MATLAB: 0.019449  speedup   0.94
MATLAB: 0.033006  speedup   0.94
MATLAB: 0.016937  speedup   0.72
MATLAB: 0.029159  speedup   1.38
MATLAB: 0.023540  speedup   0.98
MATLAB: 0.017998  speedup   0.64
MATLAB: 0.018652  speedup   0.81
MATLAB: 0.034499  speedup   1.73
MATLAB: 0.017291  speedup   0.58
MATLAB: 0.018410  speedup   0.49
MATLAB: 0.019384  speedup   0.81
MATLAB: 0.020737  speedup   1.04
MATLAB: 0.023945  speedup   0.62
MATLAB: 0.016344  speedup   0.85
MATLAB: 0.032461  speedup   1.49
MATLAB: 0.014491  speedup   0.50
MATLAB: 0.017973  speedup   0.98
MATLAB: 0.030145  speedup   1.80
MATLAB: 0.013635  speedup   0.46
MATLAB: 0.017360  speedup   1.10
MATLAB: 0.029529  speedup   1.15
MATLAB: 0.013680  speedup   0.93
MATLAB: 0.021722  speedup   1.43
MATLAB: 0.012450  speedup   0.60
MATLAB: 0.017359  speedup   1.19
MATLAB: 0.011896  speedup   0.46
MATLAB: 0.019570  speedup   1.86
MATLAB: 0.013235  speedup   1.16
-------------------------- four contig rows:
MATLAB: 0.017002  speedup   3.71
MATLAB: 0.018423  speedup   1.21
MATLAB: 0.018329  speedup   2.60
MATLAB: 0.034978  speedup   3.15
MATLAB: 0.016645  speedup   0.60
MATLAB: 0.017880  speedup   1.59
MATLAB: 0.031701  speedup   1.65
MATLAB: 0.016903  speedup   1.18
MATLAB: 0.028484  speedup   1.79
MATLAB: 0.021552  speedup   0.77
MATLAB: 0.018979  speedup   0.78
MATLAB: 0.022579  speedup   1.40
MATLAB: 0.024846  speedup   0.87
MATLAB: 0.022436  speedup   0.67
MATLAB: 0.019118  speedup   0.57
MATLAB: 0.020681  speedup   0.60
MATLAB: 0.020375  speedup   0.61
MATLAB: 0.021971  speedup   0.62
MATLAB: 0.018907  speedup   0.55
MATLAB: 0.020766  speedup   0.66
MATLAB: 0.023122  speedup   0.65
MATLAB: 0.018988  speedup   0.59
MATLAB: 0.019627  speedup   0.75
MATLAB: 0.024693  speedup   0.74
MATLAB: 0.019738  speedup   0.56
MATLAB: 0.020257  speedup   0.49
MATLAB: 0.025882  speedup   0.77
MATLAB: 0.024500  speedup   0.61
MATLAB: 0.027032  speedup   0.79
MATLAB: 0.020035  speedup   0.54
MATLAB: 0.021529  speedup   0.58
MATLAB: 0.022709  speedup   0.81
MATLAB: 0.022207  speedup   0.66
MATLAB: 0.027032  speedup   0.72
MATLAB: 0.019573  speedup   0.49
MATLAB: 0.019951  speedup   0.56
MATLAB: 0.020539  speedup   1.00
MATLAB: 0.035551  speedup   1.28
MATLAB: 0.017770  speedup   0.60
MATLAB: 0.021259  speedup   0.56
MATLAB: 0.019814  speedup   0.51
MATLAB: 0.019959  speedup   0.52
MATLAB: 0.019774  speedup   0.91
MATLAB: 0.028886  speedup   1.36
MATLAB: 0.025006  speedup   0.74
MATLAB: 0.019575  speedup   0.50
MATLAB: 0.017539  speedup   0.46
MATLAB: 0.019853  speedup   0.54
MATLAB: 0.019326  speedup   0.91
MATLAB: 0.023452  speedup   0.83
MATLAB: 0.017342  speedup   0.57
MATLAB: 0.021155  speedup   0.57
MATLAB: 0.020037  speedup   0.57
MATLAB: 0.019553  speedup   0.60
MATLAB: 0.021597  speedup   1.02
MATLAB: 0.031369  speedup   0.93
MATLAB: 0.018705  speedup   0.97
MATLAB: 0.034480  speedup   1.74
MATLAB: 0.023129  speedup   0.84
MATLAB: 0.018441  speedup   0.64
MATLAB: 0.020989  speedup   1.18
MATLAB: 0.035231  speedup   1.39
MATLAB: 0.017918  speedup   0.63
MATLAB: 0.022081  speedup   0.60
MATLAB: 0.018495  speedup   1.14
MATLAB: 0.027193  speedup   1.57
MATLAB: 0.015515  speedup   0.63
MATLAB: 0.019163  speedup   1.38
MATLAB: 0.029915  speedup   1.46
MATLAB: 0.017181  speedup   1.65
MATLAB: 0.022583  speedup   1.27
MATLAB: 0.014357  speedup   1.81
-------------------------- one column :
MATLAB: 0.000989  speedup   0.69
MATLAB: 0.000204  speedup   0.35
MATLAB: 0.000083  speedup   0.27
MATLAB: 0.000149  speedup   0.20
MATLAB: 0.000709  speedup   0.48
MATLAB: 0.000029  speedup   0.11
MATLAB: 0.000223  speedup   0.84
MATLAB: 0.000020  speedup   0.11
MATLAB: 0.000017  speedup   0.10
MATLAB: 0.000016  speedup   0.09
MATLAB: 0.000008  speedup   0.05
MATLAB: 0.000015  speedup   0.11
MATLAB: 0.000014  speedup   0.11
MATLAB: 0.000008  speedup   0.06
MATLAB: 0.000006  speedup   0.05
MATLAB: 0.000006  speedup   0.05
MATLAB: 0.000007  speedup   0.06
MATLAB: 0.000007  speedup   0.06
MATLAB: 0.000008  speedup   0.07
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000005  speedup   0.05
MATLAB: 0.000019  speedup   0.19
MATLAB: 0.000005  speedup   0.05
MATLAB: 0.000022  speedup   0.22
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000005  speedup   0.05
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000005  speedup   0.05
MATLAB: 0.000006  speedup   0.05
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000008  speedup   0.05
MATLAB: 0.000009  speedup   0.05
MATLAB: 0.000008  speedup   0.05
MATLAB: 0.000008  speedup   0.05
MATLAB: 0.000009  speedup   0.05
MATLAB: 0.000009  speedup   0.05
MATLAB: 0.000009  speedup   0.05
MATLAB: 0.000009  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.05
MATLAB: 0.000005  speedup   0.04
MATLAB: 0.000014  speedup   0.12
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.04
MATLAB: 0.000004  speedup   0.03
MATLAB: 0.000004  speedup   0.03
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000007  speedup   0.04
MATLAB: 0.000006  speedup   0.04
MATLAB: 0.000006  speedup   0.03
-------------------------- A (:,2:n):
Elapsed time is 0.343827 seconds.
Elapsed time is 0.935225 seconds.
-------------------------- case 6: ni large, no qsort, dupl :
Elapsed time is 0.405200 seconds.
Elapsed time is 0.724773 seconds.
-------------------------- case 7: ni large, no qsort, no dupl :
Elapsed time is 0.194939 seconds.
Elapsed time is 0.380638 seconds.
-------------------------- case 3: contig :
Elapsed time is 0.242306 seconds.
Elapsed time is 0.323014 seconds.

test43: all tests passed

--------------performance test GB_mex_subassign
nnzB: 38479
MATLAB start:
Elapsed time is 84.662667 seconds.
GraphBLAS start:
Elapsed time is 0.779572 seconds.
MATLAB start:
Elapsed time is 83.904278 seconds.
GraphBLAS start:
Elapsed time is 0.653769 seconds.

test46: all tests passed

================ ncols 1

------------------------------ C = A'*x
       1 : auto:     0.0002(d) dot:     0.0001 gus:     0.1647 heap:     0.1183 MATLAB     0.1233 speedup auto:     748.14 dot:    1803.88 gus:       0.75 heap:       1.04
      98 : auto:     0.0014(d) dot:     0.0014 gus:     0.1428 heap:     0.1204 MATLAB     0.1229 speedup auto:      89.69 dot:      87.83 gus:       0.86 heap:       1.02
     769 : auto:     0.0023(d) dot:     0.0022 gus:     0.1443 heap:     0.1562 MATLAB     0.1222 speedup auto:      53.08 dot:      55.29 gus:       0.85 heap:       0.78
    1286 : auto:     0.0032(d) dot:     0.0029 gus:     0.1418 heap:     0.1890 MATLAB     0.1267 speedup auto:      39.90 dot:      43.25 gus:       0.89 heap:       0.67
    1254 : auto:     0.0028(d) dot:     0.0028 gus:     0.1444 heap:     0.1869 MATLAB     0.1252 speedup auto:      44.30 dot:      45.25 gus:       0.87 heap:       0.67
    2000 : auto:     0.0041(d) dot:     0.0044 gus:     0.1432 heap:     0.2235 MATLAB     0.1354 speedup auto:      33.22 dot:      30.86 gus:       0.95 heap:       0.61
Elapsed time is 0.005961 seconds.

------------------------------ C = A*x
       1 : auto:     0.0000(h) dot:     0.1490 gus:     0.0000 heap:     0.0000 MATLAB     0.0008 speedup auto:      36.11 dot:       0.01 gus:      35.47 heap:      59.77
      99 : auto:     0.0003(G) dot:     0.1393 gus:     0.0003 heap:     0.0022 MATLAB     0.0004 speedup auto:       1.21 dot:       0.00 gus:       1.16 heap:       0.16
     778 : auto:     0.0021(G) dot:     0.1391 gus:     0.0027 heap:     0.0374 MATLAB     0.0023 speedup auto:       1.11 dot:       0.02 gus:       0.87 heap:       0.06
    1265 : auto:     0.0031(G) dot:     0.1382 gus:     0.0031 heap:     0.0673 MATLAB     0.0040 speedup auto:       1.31 dot:       0.03 gus:       1.30 heap:       0.06
    1291 : auto:     0.0033(G) dot:     0.1413 gus:     0.0031 heap:     0.0655 MATLAB     0.0040 speedup auto:       1.21 dot:       0.03 gus:       1.31 heap:       0.06
    2000 : auto:     0.0048(G) dot:     0.1479 gus:     0.0047 heap:     0.0963 MATLAB     0.0080 speedup auto:       1.67 dot:       0.05 gus:       1.71 heap:       0.08
Elapsed time is 0.008501 seconds.

------------------------------ C = x'*A
       1 : auto:     0.0002(d) dot:     0.0001 gus:     0.0054 heap:     0.0046 MATLAB     0.0041 speedup auto:      18.06 dot:      49.51 gus:       0.77 heap:       0.91
      96 : auto:     0.0014(d) dot:     0.0013 gus:     0.0515 heap:     0.0376 MATLAB     0.0090 speedup auto:       6.66 dot:       6.81 gus:       0.17 heap:       0.24
     777 : auto:     0.0023(d) dot:     0.0025 gus:     0.0226 heap:     0.0376 MATLAB     0.0276 speedup auto:      11.97 dot:      10.92 gus:       1.22 heap:       0.73
    1259 : auto:     0.0029(d) dot:     0.0037 gus:     0.0199 heap:     0.0583 MATLAB     0.0232 speedup auto:       8.05 dot:       6.21 gus:       1.17 heap:       0.40
    1257 : auto:     0.0033(d) dot:     0.0033 gus:     0.0179 heap:     0.0490 MATLAB     0.0229 speedup auto:       6.92 dot:       6.84 gus:       1.28 heap:       0.47
    2000 : auto:     0.0067(d) dot:     0.0046 gus:     0.0136 heap:     0.0647 MATLAB     0.0207 speedup auto:       3.08 dot:       4.47 gus:       1.52 heap:       0.32
Elapsed time is 0.006034 seconds.

------------------------------ C = x'*A'
       1 : auto:     0.1412(d) dot:     0.1182 gus:     0.1253 heap:     0.1266 MATLAB     0.0001 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
      98 : auto:     0.1428(d) dot:     0.1197 gus:     0.1514 heap:     0.1519 MATLAB     0.0004 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     781 : auto:     0.1454(d) dot:     0.1221 gus:     0.1373 heap:     0.1448 MATLAB     0.0022 speedup auto:       0.02 dot:       0.02 gus:       0.02 heap:       0.02
    1288 : auto:     0.1404(d) dot:     0.1228 gus:     0.1409 heap:     0.1586 MATLAB     0.0043 speedup auto:       0.03 dot:       0.04 gus:       0.03 heap:       0.03
    1281 : auto:     0.1464(d) dot:     0.1218 gus:     0.1402 heap:     0.1633 MATLAB     0.0038 speedup auto:       0.03 dot:       0.03 gus:       0.03 heap:       0.02
    2000 : auto:     0.1469(d) dot:     0.1227 gus:     0.1318 heap:     0.1780 MATLAB     0.0055 speedup auto:       0.04 dot:       0.05 gus:       0.04 heap:       0.03
Elapsed time is 0.004861 seconds.

------------------------------ C = A'*x'
       1 : auto:     0.0003(d) dot:     0.0002 gus:     0.1439 heap:     0.1204 MATLAB     0.0053 speedup auto:      20.30 dot:      31.04 gus:       0.04 heap:       0.04
      95 : auto:     0.0013(d) dot:     0.0013 gus:     0.1355 heap:     0.1210 MATLAB     0.0108 speedup auto:       8.14 dot:       8.58 gus:       0.08 heap:       0.09
     797 : auto:     0.0023(d) dot:     0.0023 gus:     0.1396 heap:     0.1800 MATLAB     0.0362 speedup auto:      15.72 dot:      15.57 gus:       0.26 heap:       0.20
    1280 : auto:     0.0033(d) dot:     0.0032 gus:     0.1451 heap:     0.1874 MATLAB     0.0247 speedup auto:       7.58 dot:       7.63 gus:       0.17 heap:       0.13
    1272 : auto:     0.0033(d) dot:     0.0033 gus:     0.1449 heap:     0.1871 MATLAB     0.0208 speedup auto:       6.36 dot:       6.31 gus:       0.14 heap:       0.11
    2000 : auto:     0.0044(d) dot:     0.0043 gus:     0.1423 heap:     0.2325 MATLAB     0.0156 speedup auto:       3.53 dot:       3.61 gus:       0.11 heap:       0.07
Elapsed time is 0.003381 seconds.

================ ncols 2

------------------------------ C = A'*x
       2 : auto:     0.0003(d) dot:     0.0002 gus:     0.1403 heap:     0.1195 MATLAB     0.1223 speedup auto:     475.64 dot:     541.62 gus:       0.87 heap:       1.02
     194 : auto:     0.0026(d) dot:     0.0025 gus:     0.1421 heap:     0.1201 MATLAB     0.1238 speedup auto:      47.15 dot:      48.95 gus:       0.87 heap:       1.03
    1589 : auto:     0.0046(d) dot:     0.0059 gus:     0.1472 heap:     0.2015 MATLAB     0.1293 speedup auto:      27.93 dot:      22.08 gus:       0.88 heap:       0.64
    2544 : auto:     0.0057(d) dot:     0.0058 gus:     0.1457 heap:     0.2541 MATLAB     0.1321 speedup auto:      23.19 dot:      22.70 gus:       0.91 heap:       0.52
    2515 : auto:     0.0073(d) dot:     0.0059 gus:     0.1539 heap:     0.2543 MATLAB     0.1312 speedup auto:      17.97 dot:      22.41 gus:       0.85 heap:       0.52
    4000 : auto:     0.0084(d) dot:     0.0078 gus:     0.1506 heap:     0.3397 MATLAB     0.1379 speedup auto:      16.36 dot:      17.65 gus:       0.92 heap:       0.41
Elapsed time is 0.005129 seconds.

------------------------------ C = A*x
       2 : auto:     0.0000(h) dot:     0.1415 gus:     0.0000 heap:     0.0000 MATLAB     0.0004 speedup auto:      17.30 dot:       0.00 gus:      11.98 heap:      19.71
     197 : auto:     0.0006(G) dot:     0.1417 gus:     0.0006 heap:     0.0035 MATLAB     0.0009 speedup auto:       1.57 dot:       0.01 gus:       1.47 heap:       0.26
    1592 : auto:     0.0041(G) dot:     0.1471 gus:     0.0038 heap:     0.0769 MATLAB     0.0043 speedup auto:       1.05 dot:       0.03 gus:       1.13 heap:       0.06
    2508 : auto:     0.0064(G) dot:     0.1441 gus:     0.0069 heap:     0.1363 MATLAB     0.0072 speedup auto:       1.13 dot:       0.05 gus:       1.04 heap:       0.05
    2524 : auto:     0.0058(G) dot:     0.1504 gus:     0.0062 heap:     0.1326 MATLAB     0.0074 speedup auto:       1.27 dot:       0.05 gus:       1.21 heap:       0.06
    4000 : auto:     0.0100(G) dot:     0.1463 gus:     0.0095 heap:     0.1995 MATLAB     0.0128 speedup auto:       1.28 dot:       0.09 gus:       1.35 heap:       0.06
Elapsed time is 0.006084 seconds.

------------------------------ C = x'*A
       2 : auto:     0.0003(d) dot:     0.0001 gus:     0.0059 heap:     0.0047 MATLAB     0.0043 speedup auto:      15.14 dot:      41.64 gus:       0.73 heap:       0.92
     194 : auto:     0.0020(d) dot:     0.0020 gus:     0.0513 heap:     0.0506 MATLAB     0.0096 speedup auto:       4.83 dot:       4.86 gus:       0.19 heap:       0.19
    1567 : auto:     0.0038(d) dot:     0.0037 gus:     0.0137 heap:     0.1770 MATLAB     0.0351 speedup auto:       9.34 dot:       9.53 gus:       2.57 heap:       0.20
    2558 : auto:     0.0080(d) dot:     0.0065 gus:     0.0346 heap:     0.1152 MATLAB     0.0404 speedup auto:       5.06 dot:       6.23 gus:       1.17 heap:       0.35
    2504 : auto:     0.0054(d) dot:     0.0054 gus:     0.0147 heap:     0.1365 MATLAB     0.0312 speedup auto:       5.76 dot:       5.75 gus:       2.12 heap:       0.23
    4000 : auto:     0.0076(d) dot:     0.0077 gus:     0.0160 heap:     0.0983 MATLAB     0.0227 speedup auto:       2.97 dot:       2.97 gus:       1.42 heap:       0.23
Elapsed time is 0.005426 seconds.

------------------------------ C = x'*A'
       2 : auto:     0.1414(d) dot:     0.1178 gus:     0.1293 heap:     0.1249 MATLAB     0.0003 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     196 : auto:     0.1445(d) dot:     0.1201 gus:     0.1582 heap:     0.1661 MATLAB     0.0010 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
    1570 : auto:     0.1420(d) dot:     0.1221 gus:     0.1320 heap:     0.1808 MATLAB     0.0049 speedup auto:       0.03 dot:       0.04 gus:       0.04 heap:       0.03
    2515 : auto:     0.1447(d) dot:     0.1255 gus:     0.1364 heap:     0.1978 MATLAB     0.0098 speedup auto:       0.07 dot:       0.08 gus:       0.07 heap:       0.05
    2528 : auto:     0.1500(d) dot:     0.1268 gus:     0.1326 heap:     0.1972 MATLAB     0.0090 speedup auto:       0.06 dot:       0.07 gus:       0.07 heap:       0.05
    4000 : auto:     0.1451(d) dot:     0.1297 gus:     0.1343 heap:     0.1972 MATLAB     0.0136 speedup auto:       0.09 dot:       0.11 gus:       0.10 heap:       0.07
Elapsed time is 0.005315 seconds.

------------------------------ C = A'*x'
       2 : auto:     0.0002(d) dot:     0.0001 gus:     0.1446 heap:     0.1163 MATLAB     0.0063 speedup auto:      32.57 dot:      59.47 gus:       0.04 heap:       0.05
     197 : auto:     0.0025(d) dot:     0.0024 gus:     0.1404 heap:     0.1475 MATLAB     0.0139 speedup auto:       5.58 dot:       5.70 gus:       0.10 heap:       0.09
    1561 : auto:     0.0056(d) dot:     0.0057 gus:     0.1832 heap:     0.2033 MATLAB     0.0144 speedup auto:       2.57 dot:       2.55 gus:       0.08 heap:       0.07
    2543 : auto:     0.0068(d) dot:     0.0062 gus:     0.1538 heap:     0.2571 MATLAB     0.0174 speedup auto:       2.55 dot:       2.79 gus:       0.11 heap:       0.07
    2549 : auto:     0.0056(d) dot:     0.0056 gus:     0.1478 heap:     0.2612 MATLAB     0.0189 speedup auto:       3.37 dot:       3.38 gus:       0.13 heap:       0.07
    4000 : auto:     0.0083(d) dot:     0.0085 gus:     0.1486 heap:     0.3301 MATLAB     0.0207 speedup auto:       2.50 dot:       2.43 gus:       0.14 heap:       0.06
Elapsed time is 0.004968 seconds.

================ ncols 3

------------------------------ C = A'*x
       3 : auto:     0.0002(d) dot:     0.0001 gus:     0.1398 heap:     0.1175 MATLAB     0.1259 speedup auto:     553.08 dot:     903.42 gus:       0.90 heap:       1.07
     294 : auto:     0.0037(d) dot:     0.0036 gus:     0.1427 heap:     0.1286 MATLAB     0.1222 speedup auto:      33.28 dot:      34.18 gus:       0.86 heap:       0.95
    2378 : auto:     0.0067(d) dot:     0.0066 gus:     0.1447 heap:     0.2356 MATLAB     0.1305 speedup auto:      19.52 dot:      19.68 gus:       0.90 heap:       0.55
    3806 : auto:     0.0091(d) dot:     0.0093 gus:     0.1534 heap:     0.3211 MATLAB     0.1330 speedup auto:      14.65 dot:      14.33 gus:       0.87 heap:       0.41
    3777 : auto:     0.0084(d) dot:     0.0084 gus:     0.1497 heap:     0.3306 MATLAB     0.1343 speedup auto:      15.99 dot:      16.08 gus:       0.90 heap:       0.41
    6000 : auto:     0.0128(d) dot:     0.0126 gus:     0.1565 heap:     0.4413 MATLAB     0.1467 speedup auto:      11.49 dot:      11.66 gus:       0.94 heap:       0.33
Elapsed time is 0.007400 seconds.

------------------------------ C = A*x
       3 : auto:     0.0000(h) dot:     0.1476 gus:     0.0000 heap:     0.0000 MATLAB     0.0000 speedup auto:       1.30 dot:       0.00 gus:       1.09 heap:       1.88
     295 : auto:     0.0008(G) dot:     0.1470 gus:     0.0009 heap:     0.0056 MATLAB     0.0011 speedup auto:       1.31 dot:       0.01 gus:       1.26 heap:       0.20
    2367 : auto:     0.0056(G) dot:     0.1460 gus:     0.0076 heap:     0.1165 MATLAB     0.0069 speedup auto:       1.22 dot:       0.05 gus:       0.92 heap:       0.06
    3806 : auto:     0.0094(G) dot:     0.1500 gus:     0.0098 heap:     0.2028 MATLAB     0.0124 speedup auto:       1.31 dot:       0.08 gus:       1.26 heap:       0.06
    3781 : auto:     0.0102(G) dot:     0.1538 gus:     0.0088 heap:     0.2101 MATLAB     0.0119 speedup auto:       1.16 dot:       0.08 gus:       1.35 heap:       0.06
    6000 : auto:     0.0148(G) dot:     0.1529 gus:     0.0143 heap:     0.3036 MATLAB     0.0166 speedup auto:       1.12 dot:       0.11 gus:       1.16 heap:       0.05
Elapsed time is 0.007751 seconds.

------------------------------ C = x'*A
       3 : auto:     0.0003(d) dot:     0.0001 gus:     0.0056 heap:     0.0048 MATLAB     0.0040 speedup auto:      12.99 dot:      39.79 gus:       0.71 heap:       0.83
     299 : auto:     0.0026(d) dot:     0.0026 gus:     0.0165 heap:     0.0244 MATLAB     0.0218 speedup auto:       8.36 dot:       8.44 gus:       1.32 heap:       0.89
    2321 : auto:     0.0051(d) dot:     0.0051 gus:     0.0134 heap:     0.1348 MATLAB     0.0295 speedup auto:       5.81 dot:       5.81 gus:       2.20 heap:       0.22
    3787 : auto:     0.0092(d) dot:     0.0090 gus:     0.0257 heap:     0.1500 MATLAB     0.0331 speedup auto:       3.58 dot:       3.66 gus:       1.29 heap:       0.22
    3817 : auto:     0.0075(d) dot:     0.0075 gus:     0.0159 heap:     0.1535 MATLAB     0.0331 speedup auto:       4.41 dot:       4.39 gus:       2.08 heap:       0.22
    6000 : auto:     0.0114(d) dot:     0.0114 gus:     0.0286 heap:     0.1134 MATLAB     0.0250 speedup auto:       2.20 dot:       2.18 gus:       0.87 heap:       0.22
Elapsed time is 0.006591 seconds.

------------------------------ C = x'*A'
       3 : auto:     0.1387(d) dot:     0.1231 gus:     0.1234 heap:     0.1271 MATLAB     0.0001 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     291 : auto:     0.1458(d) dot:     0.1234 gus:     0.1262 heap:     0.1517 MATLAB     0.0011 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
    2349 : auto:     0.1497(d) dot:     0.1254 gus:     0.1341 heap:     0.2290 MATLAB     0.0064 speedup auto:       0.04 dot:       0.05 gus:       0.05 heap:       0.03
    3803 : auto:     0.1478(d) dot:     0.1249 gus:     0.1338 heap:     0.2592 MATLAB     0.0106 speedup auto:       0.07 dot:       0.08 gus:       0.08 heap:       0.04
    3800 : auto:     0.1506(d) dot:     0.1260 gus:     0.1345 heap:     0.2553 MATLAB     0.0109 speedup auto:       0.07 dot:       0.09 gus:       0.08 heap:       0.04
    6000 : auto:     0.1666(d) dot:     0.1336 gus:     0.1473 heap:     0.2249 MATLAB     0.0158 speedup auto:       0.09 dot:       0.12 gus:       0.11 heap:       0.07
Elapsed time is 0.007889 seconds.

------------------------------ C = A'*x'
       3 : auto:     0.0004(d) dot:     0.0002 gus:     0.1414 heap:     0.1200 MATLAB     0.0053 speedup auto:      13.65 dot:      30.15 gus:       0.04 heap:       0.04
     292 : auto:     0.0038(d) dot:     0.0041 gus:     0.1416 heap:     0.1236 MATLAB     0.0098 speedup auto:       2.59 dot:       2.40 gus:       0.07 heap:       0.08
    2362 : auto:     0.0066(d) dot:     0.0065 gus:     0.1524 heap:     0.2310 MATLAB     0.0161 speedup auto:       2.46 dot:       2.47 gus:       0.11 heap:       0.07
    3792 : auto:     0.0098(d) dot:     0.0094 gus:     0.1502 heap:     0.3268 MATLAB     0.0319 speedup auto:       3.25 dot:       3.39 gus:       0.21 heap:       0.10
    3811 : auto:     0.0088(d) dot:     0.0083 gus:     0.1530 heap:     0.3293 MATLAB     0.0305 speedup auto:       3.48 dot:       3.69 gus:       0.20 heap:       0.09
    6000 : auto:     0.0118(d) dot:     0.0117 gus:     0.1535 heap:     0.4431 MATLAB     0.0308 speedup auto:       2.61 dot:       2.64 gus:       0.20 heap:       0.07
Elapsed time is 0.006745 seconds.

================ ncols 4

------------------------------ C = A'*x
       4 : auto:     0.0005(d) dot:     0.0004 gus:     0.1409 heap:     0.1186 MATLAB     0.1205 speedup auto:     262.24 dot:     297.46 gus:       0.86 heap:       1.02
     394 : auto:     0.0053(d) dot:     0.0050 gus:     0.1416 heap:     0.1298 MATLAB     0.1309 speedup auto:      24.78 dot:      26.18 gus:       0.92 heap:       1.01
    3141 : auto:     0.0104(d) dot:     0.0103 gus:     0.1523 heap:     0.2691 MATLAB     0.1305 speedup auto:      12.53 dot:      12.65 gus:       0.86 heap:       0.48
    5060 : auto:     0.0113(d) dot:     0.0134 gus:     0.1531 heap:     0.3978 MATLAB     0.1411 speedup auto:      12.48 dot:      10.55 gus:       0.92 heap:       0.35
    5058 : auto:     0.0132(d) dot:     0.0119 gus:     0.1534 heap:     0.3899 MATLAB     0.1501 speedup auto:      11.35 dot:      12.65 gus:       0.98 heap:       0.38
    8000 : auto:     0.0165(d) dot:     0.0158 gus:     0.1635 heap:     0.5506 MATLAB     0.1430 speedup auto:       8.68 dot:       9.03 gus:       0.87 heap:       0.26
Elapsed time is 0.007325 seconds.

------------------------------ C = A*x
       4 : auto:     0.0000(h) dot:     0.1395 gus:     0.0000 heap:     0.0000 MATLAB     0.0000 speedup auto:       1.64 dot:       0.00 gus:       1.08 heap:       1.89
     392 : auto:     0.0011(G) dot:     0.1457 gus:     0.0011 heap:     0.0087 MATLAB     0.0013 speedup auto:       1.20 dot:       0.01 gus:       1.19 heap:       0.15
    3158 : auto:     0.0081(G) dot:     0.1514 gus:     0.0095 heap:     0.1522 MATLAB     0.0087 speedup auto:       1.07 dot:       0.06 gus:       0.92 heap:       0.06
    5081 : auto:     0.0122(G) dot:     0.1489 gus:     0.0122 heap:     0.2688 MATLAB     0.0142 speedup auto:       1.16 dot:       0.10 gus:       1.16 heap:       0.05
    5066 : auto:     0.0117(G) dot:     0.1563 gus:     0.0171 heap:     0.2755 MATLAB     0.0171 speedup auto:       1.46 dot:       0.11 gus:       1.00 heap:       0.06
    8000 : auto:     0.0181(G) dot:     0.1607 gus:     0.0208 heap:     0.4086 MATLAB     0.0227 speedup auto:       1.26 dot:       0.14 gus:       1.09 heap:       0.06
Elapsed time is 0.008359 seconds.

------------------------------ C = x'*A
       4 : auto:     0.0003(d) dot:     0.0001 gus:     0.0054 heap:     0.0049 MATLAB     0.0040 speedup auto:      11.98 dot:      28.48 gus:       0.74 heap:       0.83
     396 : auto:     0.0030(d) dot:     0.0037 gus:     0.0197 heap:     0.0256 MATLAB     0.0113 speedup auto:       3.74 dot:       3.03 gus:       0.58 heap:       0.44
    3135 : auto:     0.0072(d) dot:     0.0153 gus:     0.0222 heap:     0.1735 MATLAB     0.0350 speedup auto:       4.84 dot:       2.28 gus:       1.58 heap:       0.20
    5072 : auto:     0.0105(d) dot:     0.0098 gus:     0.0252 heap:     0.2003 MATLAB     0.0395 speedup auto:       3.77 dot:       4.01 gus:       1.57 heap:       0.20
    5042 : auto:     0.0098(d) dot:     0.0102 gus:     0.0304 heap:     0.2092 MATLAB     0.0406 speedup auto:       4.13 dot:       3.99 gus:       1.33 heap:       0.19
    8000 : auto:     0.0159(d) dot:     0.0156 gus:     0.0364 heap:     0.1419 MATLAB     0.0368 speedup auto:       2.32 dot:       2.35 gus:       1.01 heap:       0.26
Elapsed time is 0.006335 seconds.

------------------------------ C = x'*A'
       4 : auto:     0.1432(d) dot:     0.1207 gus:     0.1250 heap:     0.1230 MATLAB     0.0002 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     388 : auto:     0.1491(d) dot:     0.1273 gus:     0.1321 heap:     0.1398 MATLAB     0.0018 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
    3175 : auto:     0.1472(d) dot:     0.1265 gus:     0.1344 heap:     0.2831 MATLAB     0.0100 speedup auto:       0.07 dot:       0.08 gus:       0.07 heap:       0.04
    5051 : auto:     0.1528(d) dot:     0.1270 gus:     0.1348 heap:     0.3027 MATLAB     0.0138 speedup auto:       0.09 dot:       0.11 gus:       0.10 heap:       0.05
    5064 : auto:     0.1535(d) dot:     0.1312 gus:     0.1414 heap:     0.3111 MATLAB     0.0176 speedup auto:       0.11 dot:       0.13 gus:       0.12 heap:       0.06
    8000 : auto:     0.1559(d) dot:     0.1351 gus:     0.1456 heap:     0.2563 MATLAB     0.0240 speedup auto:       0.15 dot:       0.18 gus:       0.16 heap:       0.09
Elapsed time is 0.007993 seconds.

------------------------------ C = A'*x'
       4 : auto:     0.0005(d) dot:     0.0004 gus:     0.1434 heap:     0.1222 MATLAB     0.0066 speedup auto:      13.84 dot:      16.47 gus:       0.05 heap:       0.05
     389 : auto:     0.0051(d) dot:     0.0052 gus:     0.1422 heap:     0.1326 MATLAB     0.0124 speedup auto:       2.45 dot:       2.40 gus:       0.09 heap:       0.09
    3172 : auto:     0.0143(d) dot:     0.0122 gus:     0.1540 heap:     0.2759 MATLAB     0.0238 speedup auto:       1.67 dot:       1.95 gus:       0.15 heap:       0.09
    5113 : auto:     0.0113(d) dot:     0.0123 gus:     0.1521 heap:     0.4125 MATLAB     0.0450 speedup auto:       3.98 dot:       3.65 gus:       0.30 heap:       0.11
    5039 : auto:     0.0127(d) dot:     0.0136 gus:     0.1550 heap:     0.3854 MATLAB     0.0376 speedup auto:       2.96 dot:       2.77 gus:       0.24 heap:       0.10
    8000 : auto:     0.0162(d) dot:     0.0162 gus:     0.1636 heap:     0.5521 MATLAB     0.0414 speedup auto:       2.56 dot:       2.56 gus:       0.25 heap:       0.08
Elapsed time is 0.005924 seconds.

================ ncols 8

------------------------------ C = A'*x
       8 : auto:     0.0006(d) dot:     0.0004 gus:     0.1382 heap:     0.1185 MATLAB     0.1204 speedup auto:     190.69 dot:     312.72 gus:       0.87 heap:       1.02
     779 : auto:     0.0101(d) dot:     0.0098 gus:     0.1439 heap:     0.1348 MATLAB     0.1242 speedup auto:      12.27 dot:      12.67 gus:       0.86 heap:       0.92
    6272 : auto:     0.0174(d) dot:     0.0305 gus:     0.1502 heap:     0.4632 MATLAB     0.1692 speedup auto:       9.70 dot:       5.54 gus:       1.13 heap:       0.37
   10022 : auto:     0.0415(d) dot:     0.0416 gus:     0.1635 heap:     0.8102 MATLAB     0.1709 speedup auto:       4.11 dot:       4.11 gus:       1.05 heap:       0.21
   10138 : auto:     0.0248(d) dot:     0.0431 gus:     0.1572 heap:     0.7176 MATLAB     0.1568 speedup auto:       6.31 dot:       3.64 gus:       1.00 heap:       0.22
   16000 : auto:     0.0322(d) dot:     0.0473 gus:     0.1745 heap:     1.0543 MATLAB     0.1803 speedup auto:       5.60 dot:       3.81 gus:       1.03 heap:       0.17
Elapsed time is 0.016668 seconds.

------------------------------ C = A*x
       8 : auto:     0.0001(h) dot:     0.1469 gus:     0.0001 heap:     0.0000 MATLAB     0.0001 speedup auto:       1.50 dot:       0.00 gus:       1.07 heap:       1.95
     784 : auto:     0.0022(G) dot:     0.1545 gus:     0.0026 heap:     0.0213 MATLAB     0.0025 speedup auto:       1.12 dot:       0.02 gus:       0.98 heap:       0.12
    6298 : auto:     0.0182(G) dot:     0.1618 gus:     0.0151 heap:     0.3294 MATLAB     0.0175 speedup auto:       0.97 dot:       0.11 gus:       1.16 heap:       0.05
   10129 : auto:     0.0270(G) dot:     0.1691 gus:     0.0280 heap:     0.5762 MATLAB     0.0367 speedup auto:       1.36 dot:       0.22 gus:       1.31 heap:       0.06
   10196 : auto:     0.0266(G) dot:     0.1721 gus:     0.0294 heap:     0.5838 MATLAB     0.0360 speedup auto:       1.36 dot:       0.21 gus:       1.22 heap:       0.06
   16000 : auto:     0.0575(G) dot:     0.1703 gus:     0.0422 heap:     0.9221 MATLAB     0.0485 speedup auto:       0.84 dot:       0.28 gus:       1.15 heap:       0.05
Elapsed time is 0.017447 seconds.

------------------------------ C = x'*A
       8 : auto:     0.0006(d) dot:     0.0006 gus:     0.0088 heap:     0.0082 MATLAB     0.0069 speedup auto:      11.54 dot:      12.22 gus:       0.79 heap:       0.85
     781 : auto:     0.0055(d) dot:     0.0044 gus:     0.0183 heap:     0.0884 MATLAB     0.0231 speedup auto:       4.18 dot:       5.28 gus:       1.26 heap:       0.26
    6276 : auto:     0.0139(d) dot:     0.0151 gus:     0.0406 heap:     0.3997 MATLAB     0.0525 speedup auto:       3.78 dot:       3.48 gus:       1.29 heap:       0.13
   10080 : auto:     0.0514(d) dot:     0.0271 gus:     0.0597 heap:     0.4904 MATLAB     0.0527 speedup auto:       1.02 dot:       1.94 gus:       0.88 heap:       0.11
   10176 : auto:     0.0226(d) dot:     0.0290 gus:     0.0522 heap:     0.4093 MATLAB     0.0541 speedup auto:       2.39 dot:       1.87 gus:       1.04 heap:       0.13
   16000 : auto:     0.0430(d) dot:     0.0335 gus:     0.0460 heap:     0.2680 MATLAB     0.0514 speedup auto:       1.19 dot:       1.53 gus:       1.12 heap:       0.19
Elapsed time is 0.015450 seconds.

------------------------------ C = x'*A'
       8 : auto:     0.1433(d) dot:     0.1167 gus:     0.1226 heap:     0.1243 MATLAB     0.0003 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     779 : auto:     0.1439(d) dot:     0.1276 gus:     0.1353 heap:     0.1676 MATLAB     0.0025 speedup auto:       0.02 dot:       0.02 gus:       0.02 heap:       0.02
    6291 : auto:     0.1526(d) dot:     0.1352 gus:     0.1428 heap:     0.4721 MATLAB     0.0211 speedup auto:       0.14 dot:       0.16 gus:       0.15 heap:       0.04
   10124 : auto:     0.1579(d) dot:     0.1385 gus:     0.1626 heap:     0.5209 MATLAB     0.0286 speedup auto:       0.18 dot:       0.21 gus:       0.18 heap:       0.05
   10130 : auto:     0.1597(d) dot:     0.1381 gus:     0.1762 heap:     0.5788 MATLAB     0.0320 speedup auto:       0.20 dot:       0.23 gus:       0.18 heap:       0.06
   16000 : auto:     0.1725(d) dot:     0.1499 gus:     0.1582 heap:     0.4408 MATLAB     0.0515 speedup auto:       0.30 dot:       0.34 gus:       0.33 heap:       0.12
Elapsed time is 0.015807 seconds.

------------------------------ C = A'*x'
       8 : auto:     0.0008(d) dot:     0.0004 gus:     0.1408 heap:     0.1180 MATLAB     0.0054 speedup auto:       6.55 dot:      12.94 gus:       0.04 heap:       0.05
     781 : auto:     0.0099(d) dot:     0.0096 gus:     0.1422 heap:     0.1353 MATLAB     0.0104 speedup auto:       1.05 dot:       1.09 gus:       0.07 heap:       0.08
    6285 : auto:     0.0199(d) dot:     0.0309 gus:     0.1493 heap:     0.4214 MATLAB     0.0455 speedup auto:       2.29 dot:       1.47 gus:       0.30 heap:       0.11
   10044 : auto:     0.0236(d) dot:     0.0370 gus:     0.1552 heap:     0.6651 MATLAB     0.0575 speedup auto:       2.44 dot:       1.56 gus:       0.37 heap:       0.09
   10099 : auto:     0.0237(d) dot:     0.0368 gus:     0.1544 heap:     0.6703 MATLAB     0.0602 speedup auto:       2.54 dot:       1.63 gus:       0.39 heap:       0.09
   16000 : auto:     0.0455(d) dot:     0.0425 gus:     0.1594 heap:     0.9893 MATLAB     0.0544 speedup auto:       1.20 dot:       1.28 gus:       0.34 heap:       0.06
Elapsed time is 0.011713 seconds.

================ ncols 16

------------------------------ C = A'*x
      16 : auto:     0.0015(d) dot:     0.0009 gus:     0.1398 heap:     0.1251 MATLAB     0.1210 speedup auto:      82.51 dot:     128.67 gus:       0.87 heap:       0.97
    1565 : auto:     0.0234(d) dot:     0.0340 gus:     0.1349 heap:     0.1529 MATLAB     0.1276 speedup auto:       5.44 dot:       3.75 gus:       0.95 heap:       0.83
   12578 : auto:     0.0480(d) dot:     0.0528 gus:     0.1838 heap:     0.7256 MATLAB     0.1601 speedup auto:       3.34 dot:       3.03 gus:       0.87 heap:       0.22
   20250 : auto:     0.0583(d) dot:     0.0576 gus:     0.1727 heap:     1.2545 MATLAB     0.1867 speedup auto:       3.20 dot:       3.24 gus:       1.08 heap:       0.15
   20219 : auto:     0.0590(d) dot:     0.0554 gus:     0.2001 heap:     1.2867 MATLAB     0.1908 speedup auto:       3.23 dot:       3.44 gus:       0.95 heap:       0.15
   32000 : auto:     0.0776(d) dot:     0.0785 gus:     0.2041 heap:     1.8771 MATLAB     0.2228 speedup auto:       2.87 dot:       2.84 gus:       1.09 heap:       0.12
Elapsed time is 0.024905 seconds.

------------------------------ C = A*x
      16 : auto:     0.0001(h) dot:     0.1425 gus:     0.0001 heap:     0.0001 MATLAB     0.0002 speedup auto:       1.92 dot:       0.00 gus:       1.07 heap:       2.11
    1557 : auto:     0.0048(G) dot:     0.1640 gus:     0.0041 heap:     0.0414 MATLAB     0.0046 speedup auto:       0.95 dot:       0.03 gus:       1.12 heap:       0.11
   12596 : auto:     0.0446(G) dot:     0.1711 gus:     0.0315 heap:     0.6253 MATLAB     0.0416 speedup auto:       0.93 dot:       0.24 gus:       1.32 heap:       0.07
   20149 : auto:     0.0652(G) dot:     0.1725 gus:     0.0521 heap:     1.0649 MATLAB     0.0581 speedup auto:       0.89 dot:       0.34 gus:       1.11 heap:       0.05
   20224 : auto:     0.0653(G) dot:     0.1914 gus:     0.0640 heap:     1.1072 MATLAB     0.0580 speedup auto:       0.89 dot:       0.30 gus:       0.91 heap:       0.05
   32000 : auto:     0.1049(G) dot:     0.1936 gus:     0.0722 heap:     1.6130 MATLAB     0.0929 speedup auto:       0.89 dot:       0.48 gus:       1.29 heap:       0.06
Elapsed time is 0.032252 seconds.

------------------------------ C = x'*A
      16 : auto:     0.0007(d) dot:     0.0003 gus:     0.0159 heap:     0.0058 MATLAB     0.0072 speedup auto:      10.10 dot:      23.40 gus:       0.45 heap:       1.23
    1556 : auto:     0.0067(d) dot:     0.0147 gus:     0.0161 heap:     0.1468 MATLAB     0.0278 speedup auto:       4.15 dot:       1.89 gus:       1.73 heap:       0.19
   12537 : auto:     0.0244(d) dot:     0.0363 gus:     0.0544 heap:     0.7125 MATLAB     0.0591 speedup auto:       2.42 dot:       1.63 gus:       1.09 heap:       0.08
   20154 : auto:     0.0505(d) dot:     0.0464 gus:     0.0587 heap:     0.8566 MATLAB     0.0803 speedup auto:       1.59 dot:       1.73 gus:       1.37 heap:       0.09
   20261 : auto:     0.0493(d) dot:     0.0535 gus:     0.0614 heap:     0.8209 MATLAB     0.0823 speedup auto:       1.67 dot:       1.54 gus:       1.34 heap:       0.10
   32000 : auto:     0.0717(d) dot:     0.0710 gus:     0.0697 heap:     0.5826 MATLAB     0.0834 speedup auto:       1.16 dot:       1.17 gus:       1.20 heap:       0.14
Elapsed time is 0.030272 seconds.

------------------------------ C = x'*A'
      16 : auto:     0.1663(d) dot:     0.1180 gus:     0.1412 heap:     0.1239 MATLAB     0.0005 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
    1562 : auto:     0.1490(d) dot:     0.1268 gus:     0.1325 heap:     0.2495 MATLAB     0.0051 speedup auto:       0.03 dot:       0.04 gus:       0.04 heap:       0.02
   12597 : auto:     0.1781(d) dot:     0.1435 gus:     0.1663 heap:     0.8229 MATLAB     0.0379 speedup auto:       0.21 dot:       0.26 gus:       0.23 heap:       0.05
   20244 : auto:     0.1865(d) dot:     0.1637 gus:     0.1771 heap:     1.0049 MATLAB     0.0589 speedup auto:       0.32 dot:       0.36 gus:       0.33 heap:       0.06
   20282 : auto:     0.2006(d) dot:     0.1786 gus:     0.1880 heap:     0.9494 MATLAB     0.0610 speedup auto:       0.30 dot:       0.34 gus:       0.32 heap:       0.06
   32000 : auto:     0.1987(d) dot:     0.1796 gus:     0.1805 heap:     0.7040 MATLAB     0.0909 speedup auto:       0.46 dot:       0.51 gus:       0.50 heap:       0.13
Elapsed time is 0.045948 seconds.

------------------------------ C = A'*x'
      16 : auto:     0.0019(d) dot:     0.0010 gus:     0.1484 heap:     0.1222 MATLAB     0.0060 speedup auto:       3.22 dot:       5.72 gus:       0.04 heap:       0.05
    1560 : auto:     0.0199(d) dot:     0.0319 gus:     0.1402 heap:     0.1560 MATLAB     0.0156 speedup auto:       0.78 dot:       0.49 gus:       0.11 heap:       0.10
   12543 : auto:     0.0484(d) dot:     0.0531 gus:     0.1529 heap:     0.7959 MATLAB     0.0618 speedup auto:       1.28 dot:       1.16 gus:       0.40 heap:       0.08
   20221 : auto:     0.0642(d) dot:     0.0562 gus:     0.1758 heap:     1.2519 MATLAB     0.0865 speedup auto:       1.35 dot:       1.54 gus:       0.49 heap:       0.07
   20165 : auto:     0.0610(d) dot:     0.0574 gus:     0.1928 heap:     1.1955 MATLAB     0.0804 speedup auto:       1.32 dot:       1.40 gus:       0.42 heap:       0.07
   32000 : auto:     0.0782(d) dot:     0.0764 gus:     0.1947 heap:     1.8444 MATLAB     0.0925 speedup auto:       1.18 dot:       1.21 gus:       0.48 heap:       0.05
Elapsed time is 0.050787 seconds.

Prob = 

  struct with fields:

         A: [72000×72000 double]
      name: 'ND/nd24k'
     title: 'ND problem set, matrix nd24k'
        id: 939
      date: '2003'
    author: 'author unknown'
        ed: 'T. Davis'
      kind: '2D/3D problem'


================ ncols 1

------------------------------ C = A'*x
       1 : auto:     0.0015(d) dot:     0.0015 gus:     0.6773 heap:     0.6412 MATLAB     0.6583 speedup auto:     449.61 dot:     451.17 gus:       0.97 heap:       1.03
     100 : auto:     0.0636(d) dot:     0.0582 gus:     0.6474 heap:     0.6382 MATLAB     0.6826 speedup auto:      10.73 dot:      11.73 gus:       1.05 heap:       1.07
     989 : auto:     0.0766(d) dot:     0.0721 gus:     0.6454 heap:     0.6544 MATLAB     0.6453 speedup auto:       8.43 dot:       8.95 gus:       1.00 heap:       0.99
    4829 : auto:     0.0983(d) dot:     0.0815 gus:     0.6466 heap:     0.6878 MATLAB     0.6597 speedup auto:       6.71 dot:       8.09 gus:       1.02 heap:       0.96
   45542 : auto:     0.1614(d) dot:     0.1429 gus:     0.6956 heap:     1.2048 MATLAB     0.7035 speedup auto:       4.36 dot:       4.92 gus:       1.01 heap:       0.58
   72000 : auto:     0.0669(d) dot:     0.0672 gus:     0.7039 heap:     1.5335 MATLAB     0.7167 speedup auto:      10.71 dot:      10.67 gus:       1.02 heap:       0.47
Elapsed time is 0.026710 seconds.

------------------------------ C = A*x
       1 : auto:     0.0000(h) dot:     0.7126 gus:     0.0000 heap:     0.0000 MATLAB     0.0006 speedup auto:      46.26 dot:       0.00 gus:      14.87 heap:      53.94
     100 : auto:     0.0007(h) dot:     0.7325 gus:     0.0012 heap:     0.0008 MATLAB     0.0020 speedup auto:       3.11 dot:       0.00 gus:       1.64 heap:       2.59
     996 : auto:     0.0053(G) dot:     0.7242 gus:     0.0047 heap:     0.0105 MATLAB     0.0044 speedup auto:       0.84 dot:       0.01 gus:       0.94 heap:       0.42
    4836 : auto:     0.0083(G) dot:     0.7441 gus:     0.0065 heap:     0.0542 MATLAB     0.0107 speedup auto:       1.29 dot:       0.01 gus:       1.65 heap:       0.20
   45566 : auto:     0.0572(G) dot:     0.7942 gus:     0.0437 heap:     0.5492 MATLAB     0.0532 speedup auto:       0.93 dot:       0.07 gus:       1.22 heap:       0.10
   72000 : auto:     0.0753(G) dot:     0.7590 gus:     0.0633 heap:     0.8762 MATLAB     0.0781 speedup auto:       1.04 dot:       0.10 gus:       1.23 heap:       0.09
Elapsed time is 0.048242 seconds.

------------------------------ C = x'*A
       1 : auto:     0.0057(d) dot:     0.0043 gus:     0.0789 heap:     0.0604 MATLAB     0.0693 speedup auto:      12.12 dot:      16.08 gus:       0.88 heap:       1.15
     100 : auto:     0.0946(d) dot:     0.0783 gus:     0.2701 heap:     0.2287 MATLAB     0.0706 speedup auto:       0.75 dot:       0.90 gus:       0.26 heap:       0.31
     990 : auto:     0.1627(d) dot:     0.1435 gus:     0.4163 heap:     0.3582 MATLAB     0.0753 speedup auto:       0.46 dot:       0.52 gus:       0.18 heap:       0.21
    4828 : auto:     0.1620(d) dot:     0.1481 gus:     0.4857 heap:     0.4890 MATLAB     0.0891 speedup auto:       0.55 dot:       0.60 gus:       0.18 heap:       0.18
   45553 : auto:     0.4173(d) dot:     0.3816 gus:     0.1669 heap:     0.4009 MATLAB     0.1896 speedup auto:       0.45 dot:       0.50 gus:       1.14 heap:       0.47
   72000 : auto:     0.0667(d) dot:     0.0627 gus:     0.1042 heap:     0.3963 MATLAB     0.1205 speedup auto:       1.81 dot:       1.92 gus:       1.16 heap:       0.30
Elapsed time is 0.054296 seconds.

------------------------------ C = x'*A'
       1 : auto:     0.6640(d) dot:     0.6570 gus:     0.6777 heap:     0.6938 MATLAB     0.0004 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     100 : auto:     0.7609(d) dot:     0.7369 gus:     0.9035 heap:     0.8597 MATLAB     0.0015 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     992 : auto:     0.8189(d) dot:     0.7832 gus:     1.0437 heap:     0.9995 MATLAB     0.0028 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
    4831 : auto:     0.8146(d) dot:     0.8165 gus:     1.1196 heap:     1.1414 MATLAB     0.0081 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
   45543 : auto:     1.0483(d) dot:     1.0227 gus:     0.8169 heap:     1.0542 MATLAB     0.0492 speedup auto:       0.05 dot:       0.05 gus:       0.06 heap:       0.05
   72000 : auto:     0.7271(d) dot:     0.6938 gus:     0.7631 heap:     1.0105 MATLAB     0.0869 speedup auto:       0.12 dot:       0.13 gus:       0.11 heap:       0.09
Elapsed time is 0.057178 seconds.

------------------------------ C = A'*x'
       1 : auto:     0.0024(d) dot:     0.0023 gus:     0.6816 heap:     0.6494 MATLAB     0.0778 speedup auto:      32.86 dot:      33.17 gus:       0.11 heap:       0.12
     100 : auto:     0.0575(d) dot:     0.0631 gus:     0.6699 heap:     0.6289 MATLAB     0.0814 speedup auto:       1.42 dot:       1.29 gus:       0.12 heap:       0.13
     990 : auto:     0.0793(d) dot:     0.0733 gus:     0.6445 heap:     0.6404 MATLAB     0.0806 speedup auto:       1.02 dot:       1.10 gus:       0.13 heap:       0.13
    4813 : auto:     0.1001(d) dot:     0.0802 gus:     0.6485 heap:     0.7056 MATLAB     0.0881 speedup auto:       0.88 dot:       1.10 gus:       0.14 heap:       0.12
   45526 : auto:     0.1757(d) dot:     0.1429 gus:     0.6733 heap:     1.2157 MATLAB     0.1911 speedup auto:       1.09 dot:       1.34 gus:       0.28 heap:       0.16
   72000 : auto:     0.0704(d) dot:     0.0679 gus:     0.6996 heap:     1.5281 MATLAB     0.1337 speedup auto:       1.90 dot:       1.97 gus:       0.19 heap:       0.09
Elapsed time is 0.031813 seconds.

================ ncols 2

------------------------------ C = A'*x
       2 : auto:     0.6594(h) dot:     0.0034 gus:     0.6387 heap:     0.6254 MATLAB     0.6323 speedup auto:       0.96 dot:     185.68 gus:       0.99 heap:       1.01
     200 : auto:     0.6900(h) dot:     0.0978 gus:     0.6551 heap:     0.6485 MATLAB     0.6464 speedup auto:       0.94 dot:       6.61 gus:       0.99 heap:       1.00
    1984 : auto:     0.6780(G) dot:     0.1293 gus:     0.6416 heap:     0.6605 MATLAB     0.6590 speedup auto:       0.97 dot:       5.10 gus:       1.03 heap:       1.00
    9655 : auto:     0.6759(G) dot:     0.1543 gus:     0.6664 heap:     0.7438 MATLAB     0.6756 speedup auto:       1.00 dot:       4.38 gus:       1.01 heap:       0.91
   91118 : auto:     0.7608(G) dot:     0.2960 gus:     0.7356 heap:     1.7520 MATLAB     0.7753 speedup auto:       1.02 dot:       2.62 gus:       1.05 heap:       0.44
  144000 : auto:     0.1324(d) dot:     0.1107 gus:     0.7775 heap:     2.4025 MATLAB     0.8202 speedup auto:       6.20 dot:       7.41 gus:       1.05 heap:       0.34
Elapsed time is 0.052481 seconds.

------------------------------ C = A*x
       2 : auto:     0.0001(h) dot:     0.6899 gus:     0.0001 heap:     0.0001 MATLAB     0.0003 speedup auto:       3.27 dot:       0.00 gus:       3.01 heap:       4.90
     200 : auto:     0.0014(h) dot:     0.7580 gus:     0.0024 heap:     0.0018 MATLAB     0.0039 speedup auto:       2.72 dot:       0.01 gus:       1.64 heap:       2.16
    1983 : auto:     0.0082(G) dot:     0.7894 gus:     0.0083 heap:     0.0226 MATLAB     0.0089 speedup auto:       1.09 dot:       0.01 gus:       1.08 heap:       0.39
    9665 : auto:     0.0158(G) dot:     0.8378 gus:     0.0156 heap:     0.1163 MATLAB     0.0235 speedup auto:       1.48 dot:       0.03 gus:       1.51 heap:       0.20
   91054 : auto:     0.1080(G) dot:     0.9415 gus:     0.0861 heap:     1.0940 MATLAB     0.1150 speedup auto:       1.06 dot:       0.12 gus:       1.34 heap:       0.11
  144000 : auto:     0.1496(G) dot:     0.7603 gus:     0.1243 heap:     1.7597 MATLAB     0.1509 speedup auto:       1.01 dot:       0.20 gus:       1.21 heap:       0.09
Elapsed time is 0.054919 seconds.

------------------------------ C = x'*A
       2 : auto:     0.0877(G) dot:     0.0092 gus:     0.0561 heap:     0.0527 MATLAB     0.0707 speedup auto:       0.81 dot:       7.68 gus:       1.26 heap:       1.34
     200 : auto:     0.4321(G) dot:     0.0839 gus:     0.4126 heap:     0.2502 MATLAB     0.0785 speedup auto:       0.18 dot:       0.94 gus:       0.19 heap:       0.31
    1988 : auto:     0.5383(G) dot:     0.2081 gus:     0.5394 heap:     0.4142 MATLAB     0.0899 speedup auto:       0.17 dot:       0.43 gus:       0.17 heap:       0.22
    9664 : auto:     0.1178(G) dot:     0.2745 gus:     0.0869 heap:     0.1791 MATLAB     0.0991 speedup auto:       0.84 dot:       0.36 gus:       1.14 heap:       0.55
   90891 : auto:     0.2195(G) dot:     0.8154 gus:     0.2021 heap:     0.7648 MATLAB     0.2413 speedup auto:       1.10 dot:       0.30 gus:       1.19 heap:       0.32
  144000 : auto:     0.1108(d) dot:     0.0954 gus:     0.1510 heap:     0.5754 MATLAB     0.1781 speedup auto:       1.61 dot:       1.87 gus:       1.18 heap:       0.31
Elapsed time is 0.052381 seconds.

------------------------------ C = x'*A'
       2 : auto:     0.7253(G) dot:     0.6541 gus:     0.6951 heap:     0.6977 MATLAB     0.0002 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     200 : auto:     1.0989(G) dot:     0.7282 gus:     1.0673 heap:     0.8738 MATLAB     0.0011 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
    1981 : auto:     1.2012(G) dot:     0.8448 gus:     1.3663 heap:     1.1185 MATLAB     0.0079 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
    9691 : auto:     0.8120(G) dot:     0.9977 gus:     0.7952 heap:     0.8401 MATLAB     0.0235 speedup auto:       0.03 dot:       0.02 gus:       0.03 heap:       0.03
   90985 : auto:     0.9218(G) dot:     1.5547 gus:     0.9193 heap:     1.4289 MATLAB     0.1143 speedup auto:       0.12 dot:       0.07 gus:       0.12 heap:       0.08
  144000 : auto:     0.7781(d) dot:     0.7657 gus:     0.7959 heap:     1.4547 MATLAB     0.1494 speedup auto:       0.19 dot:       0.20 gus:       0.19 heap:       0.10
Elapsed time is 0.053278 seconds.

------------------------------ C = A'*x'
       2 : auto:     0.6803(h) dot:     0.0032 gus:     0.6439 heap:     0.6407 MATLAB     0.0833 speedup auto:       0.12 dot:      26.03 gus:       0.13 heap:       0.13
     200 : auto:     0.6720(h) dot:     0.1058 gus:     0.6869 heap:     0.6571 MATLAB     0.0866 speedup auto:       0.13 dot:       0.82 gus:       0.13 heap:       0.13
    1988 : auto:     0.6642(G) dot:     0.1356 gus:     0.6585 heap:     0.6618 MATLAB     0.0959 speedup auto:       0.14 dot:       0.71 gus:       0.15 heap:       0.14
    9637 : auto:     0.6691(G) dot:     0.1531 gus:     0.6442 heap:     0.7460 MATLAB     0.1052 speedup auto:       0.16 dot:       0.69 gus:       0.16 heap:       0.14
   91014 : auto:     0.7813(G) dot:     0.2917 gus:     0.7378 heap:     1.7515 MATLAB     0.2272 speedup auto:       0.29 dot:       0.78 gus:       0.31 heap:       0.13
  144000 : auto:     0.1501(d) dot:     0.1150 gus:     0.7665 heap:     2.4360 MATLAB     0.1693 speedup auto:       1.13 dot:       1.47 gus:       0.22 heap:       0.07
Elapsed time is 0.050739 seconds.

================ ncols 3

------------------------------ C = A'*x
       3 : auto:     0.6790(h) dot:     0.0049 gus:     0.6914 heap:     0.7550 MATLAB     0.7010 speedup auto:       1.03 dot:     142.61 gus:       1.01 heap:       0.93
     300 : auto:     0.6724(h) dot:     0.1676 gus:     0.6636 heap:     0.6474 MATLAB     0.6383 speedup auto:       0.95 dot:       3.81 gus:       0.96 heap:       0.99
    2974 : auto:     0.6806(G) dot:     0.1930 gus:     0.6331 heap:     0.6818 MATLAB     0.6618 speedup auto:       0.97 dot:       3.43 gus:       1.05 heap:       0.97
   14503 : auto:     0.7060(G) dot:     0.2346 gus:     0.6658 heap:     0.8893 MATLAB     0.7262 speedup auto:       1.03 dot:       3.10 gus:       1.09 heap:       0.82
  136256 : auto:     0.8663(G) dot:     0.4552 gus:     0.9262 heap:     2.3116 MATLAB     0.8097 speedup auto:       0.93 dot:       1.78 gus:       0.87 heap:       0.35
  216000 : auto:     0.1864(d) dot:     0.1600 gus:     0.8340 heap:     3.3158 MATLAB     0.8650 speedup auto:       4.64 dot:       5.41 gus:       1.04 heap:       0.26
Elapsed time is 0.075240 seconds.

------------------------------ C = A*x
       3 : auto:     0.0004(h) dot:     0.6951 gus:     0.0001 heap:     0.0001 MATLAB     0.0003 speedup auto:       0.67 dot:       0.00 gus:       2.68 heap:       4.07
     300 : auto:     0.0019(h) dot:     0.8251 gus:     0.0037 heap:     0.0019 MATLAB     0.0037 speedup auto:       2.00 dot:       0.00 gus:       1.00 heap:       1.95
    2978 : auto:     0.0123(G) dot:     0.8735 gus:     0.0127 heap:     0.0312 MATLAB     0.0151 speedup auto:       1.23 dot:       0.02 gus:       1.18 heap:       0.48
   14485 : auto:     0.0227(G) dot:     0.9033 gus:     0.0200 heap:     0.1594 MATLAB     0.0319 speedup auto:       1.41 dot:       0.04 gus:       1.59 heap:       0.20
  136651 : auto:     0.1444(G) dot:     1.0778 gus:     0.1364 heap:     1.6645 MATLAB     0.1661 speedup auto:       1.15 dot:       0.15 gus:       1.22 heap:       0.10
  216000 : auto:     0.2221(G) dot:     0.8022 gus:     0.2025 heap:     2.6186 MATLAB     0.2393 speedup auto:       1.08 dot:       0.30 gus:       1.18 heap:       0.09
Elapsed time is 0.083995 seconds.

------------------------------ C = x'*A
       3 : auto:     0.0689(G) dot:     0.0054 gus:     0.0744 heap:     0.0556 MATLAB     0.0696 speedup auto:       1.01 dot:      12.87 gus:       0.94 heap:       1.25
     300 : auto:     0.5280(G) dot:     0.0975 gus:     0.4901 heap:     0.2659 MATLAB     0.0788 speedup auto:       0.15 dot:       0.81 gus:       0.16 heap:       0.30
    2979 : auto:     0.6149(G) dot:     0.2721 gus:     0.5899 heap:     0.4598 MATLAB     0.1026 speedup auto:       0.17 dot:       0.38 gus:       0.17 heap:       0.22
   14482 : auto:     0.1370(G) dot:     0.4387 gus:     0.1229 heap:     0.2646 MATLAB     0.1268 speedup auto:       0.93 dot:       0.29 gus:       1.03 heap:       0.48
  136513 : auto:     0.2329(G) dot:     1.2571 gus:     0.2215 heap:     1.1436 MATLAB     0.2733 speedup auto:       1.17 dot:       0.22 gus:       1.23 heap:       0.24
  216000 : auto:     0.1465(d) dot:     0.1186 gus:     0.1762 heap:     0.8013 MATLAB     0.2111 speedup auto:       1.44 dot:       1.78 gus:       1.20 heap:       0.26
Elapsed time is 0.073309 seconds.

------------------------------ C = x'*A'
       3 : auto:     0.7283(G) dot:     0.6497 gus:     0.6955 heap:     0.6901 MATLAB     0.0003 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     299 : auto:     1.1418(G) dot:     0.7435 gus:     1.1927 heap:     0.9270 MATLAB     0.0038 speedup auto:       0.00 dot:       0.01 gus:       0.00 heap:       0.00
    2981 : auto:     1.3123(G) dot:     0.9359 gus:     1.2245 heap:     1.1049 MATLAB     0.0111 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
   14457 : auto:     0.7698(G) dot:     1.0811 gus:     0.7531 heap:     0.9100 MATLAB     0.0260 speedup auto:       0.03 dot:       0.02 gus:       0.03 heap:       0.03
  136380 : auto:     0.8786(G) dot:     1.8847 gus:     0.8660 heap:     1.7736 MATLAB     0.1586 speedup auto:       0.18 dot:       0.08 gus:       0.18 heap:       0.09
  216000 : auto:     0.7692(d) dot:     0.7471 gus:     0.8046 heap:     1.4554 MATLAB     0.2138 speedup auto:       0.28 dot:       0.29 gus:       0.27 heap:       0.15
Elapsed time is 0.073647 seconds.

------------------------------ C = A'*x'
       3 : auto:     0.6619(h) dot:     0.0049 gus:     0.6314 heap:     0.6277 MATLAB     0.0767 speedup auto:       0.12 dot:      15.70 gus:       0.12 heap:       0.12
     300 : auto:     0.6856(h) dot:     0.1577 gus:     0.6612 heap:     0.6582 MATLAB     0.0947 speedup auto:       0.14 dot:       0.60 gus:       0.14 heap:       0.14
    2986 : auto:     0.6899(G) dot:     0.2000 gus:     0.6543 heap:     0.6698 MATLAB     0.1090 speedup auto:       0.16 dot:       0.54 gus:       0.17 heap:       0.16
   14466 : auto:     0.6879(G) dot:     0.2318 gus:     0.6643 heap:     0.8390 MATLAB     0.1192 speedup auto:       0.17 dot:       0.51 gus:       0.18 heap:       0.14
  136569 : auto:     0.7755(G) dot:     0.4224 gus:     0.7765 heap:     2.2454 MATLAB     0.2631 speedup auto:       0.34 dot:       0.62 gus:       0.34 heap:       0.12
  216000 : auto:     0.1898(d) dot:     0.1675 gus:     0.8749 heap:     3.2705 MATLAB     0.2407 speedup auto:       1.27 dot:       1.44 gus:       0.28 heap:       0.07
Elapsed time is 0.069766 seconds.

================ ncols 4

------------------------------ C = A'*x
       4 : auto:     0.6873(h) dot:     0.0068 gus:     0.6445 heap:     0.6864 MATLAB     0.6410 speedup auto:       0.93 dot:      94.28 gus:       0.99 heap:       0.93
     400 : auto:     0.6794(h) dot:     0.1914 gus:     0.6510 heap:     0.6466 MATLAB     0.6443 speedup auto:       0.95 dot:       3.37 gus:       0.99 heap:       1.00
    3985 : auto:     0.6851(G) dot:     0.2696 gus:     0.6846 heap:     0.6674 MATLAB     0.7100 speedup auto:       1.04 dot:       2.63 gus:       1.04 heap:       1.06
   19339 : auto:     0.6933(G) dot:     0.3227 gus:     0.6608 heap:     0.8553 MATLAB     0.6903 speedup auto:       1.00 dot:       2.14 gus:       1.04 heap:       0.81
  181845 : auto:     0.8167(G) dot:     0.5798 gus:     0.8351 heap:     2.8184 MATLAB     0.8512 speedup auto:       1.04 dot:       1.47 gus:       1.02 heap:       0.30
  288000 : auto:     0.2587(d) dot:     0.2147 gus:     0.8881 heap:     4.1370 MATLAB     0.9331 speedup auto:       3.61 dot:       4.35 gus:       1.05 heap:       0.23
Elapsed time is 0.059226 seconds.

------------------------------ C = A*x
       4 : auto:     0.0003(h) dot:     0.6782 gus:     0.0001 heap:     0.0001 MATLAB     0.0005 speedup auto:       1.79 dot:       0.00 gus:       3.72 heap:       5.24
     399 : auto:     0.0029(h) dot:     0.8551 gus:     0.0050 heap:     0.0032 MATLAB     0.0062 speedup auto:       2.16 dot:       0.01 gus:       1.23 heap:       1.95
    3977 : auto:     0.0184(G) dot:     0.9245 gus:     0.0165 heap:     0.0425 MATLAB     0.0192 speedup auto:       1.04 dot:       0.02 gus:       1.16 heap:       0.45
   19305 : auto:     0.0435(G) dot:     0.9687 gus:     0.0283 heap:     0.2161 MATLAB     0.0414 speedup auto:       0.95 dot:       0.04 gus:       1.46 heap:       0.19
  181976 : auto:     0.1710(G) dot:     1.2306 gus:     0.1797 heap:     2.2095 MATLAB     0.2101 speedup auto:       1.23 dot:       0.17 gus:       1.17 heap:       0.10
  288000 : auto:     0.2775(G) dot:     0.8510 gus:     0.2713 heap:     3.5170 MATLAB     0.2870 speedup auto:       1.03 dot:       0.34 gus:       1.06 heap:       0.08
Elapsed time is 0.076901 seconds.

------------------------------ C = x'*A
       4 : auto:     0.0768(G) dot:     0.0075 gus:     0.0648 heap:     0.0680 MATLAB     0.0725 speedup auto:       0.94 dot:       9.68 gus:       1.12 heap:       1.07
     399 : auto:     0.5579(G) dot:     0.1078 gus:     0.5602 heap:     0.2769 MATLAB     0.0837 speedup auto:       0.15 dot:       0.78 gus:       0.15 heap:       0.30
    3975 : auto:     0.6633(G) dot:     0.3354 gus:     0.6445 heap:     0.5087 MATLAB     0.1515 speedup auto:       0.23 dot:       0.45 gus:       0.24 heap:       0.30
   19348 : auto:     0.1485(G) dot:     0.5642 gus:     0.1329 heap:     0.3537 MATLAB     0.1410 speedup auto:       0.95 dot:       0.25 gus:       1.06 heap:       0.40
  182157 : auto:     0.2426(G) dot:     1.7546 gus:     0.2425 heap:     1.4917 MATLAB     0.3068 speedup auto:       1.26 dot:       0.17 gus:       1.27 heap:       0.21
  288000 : auto:     0.1654(d) dot:     0.1579 gus:     0.2217 heap:     0.9587 MATLAB     0.2583 speedup auto:       1.56 dot:       1.64 gus:       1.17 heap:       0.27
Elapsed time is 0.069093 seconds.

------------------------------ C = x'*A'
       4 : auto:     0.7232(G) dot:     0.6513 gus:     0.6944 heap:     0.6987 MATLAB     0.0004 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     398 : auto:     1.2270(G) dot:     0.7480 gus:     1.1863 heap:     0.9306 MATLAB     0.0053 speedup auto:       0.00 dot:       0.01 gus:       0.00 heap:       0.01
    3970 : auto:     1.2887(G) dot:     0.9871 gus:     1.2849 heap:     1.1380 MATLAB     0.0156 speedup auto:       0.01 dot:       0.02 gus:       0.01 heap:       0.01
   19322 : auto:     0.8149(G) dot:     1.2109 gus:     0.7631 heap:     1.0055 MATLAB     0.0371 speedup auto:       0.05 dot:       0.03 gus:       0.05 heap:       0.04
  181797 : auto:     0.9049(G) dot:     2.3601 gus:     0.8885 heap:     2.1650 MATLAB     0.2090 speedup auto:       0.23 dot:       0.09 gus:       0.24 heap:       0.10
  288000 : auto:     0.8177(d) dot:     0.7789 gus:     0.8542 heap:     1.6026 MATLAB     0.3102 speedup auto:       0.38 dot:       0.40 gus:       0.36 heap:       0.19
Elapsed time is 0.087504 seconds.

------------------------------ C = A'*x'
       4 : auto:     0.6487(h) dot:     0.0065 gus:     0.6492 heap:     0.6334 MATLAB     0.0765 speedup auto:       0.12 dot:      11.67 gus:       0.12 heap:       0.12
     400 : auto:     0.6621(h) dot:     0.1990 gus:     0.6435 heap:     0.6487 MATLAB     0.0919 speedup auto:       0.14 dot:       0.46 gus:       0.14 heap:       0.14
    3973 : auto:     0.7031(G) dot:     0.2551 gus:     0.6614 heap:     0.6789 MATLAB     0.1111 speedup auto:       0.16 dot:       0.44 gus:       0.17 heap:       0.16
   19334 : auto:     0.6883(G) dot:     0.3046 gus:     0.6697 heap:     0.8538 MATLAB     0.1436 speedup auto:       0.21 dot:       0.47 gus:       0.21 heap:       0.17
  182188 : auto:     0.8313(G) dot:     0.6184 gus:     0.8216 heap:     2.8117 MATLAB     0.3102 speedup auto:       0.37 dot:       0.50 gus:       0.38 heap:       0.11
  288000 : auto:     0.2495(d) dot:     0.2194 gus:     0.9614 heap:     4.1176 MATLAB     0.3143 speedup auto:       1.26 dot:       1.43 gus:       0.33 heap:       0.08
Elapsed time is 0.060121 seconds.

================ ncols 8

------------------------------ C = A'*x
       8 : auto:     0.6680(h) dot:     0.0116 gus:     0.6770 heap:     0.6440 MATLAB     0.6352 speedup auto:       0.95 dot:      54.94 gus:       0.94 heap:       0.99
     798 : auto:     0.6862(h) dot:     0.3916 gus:     0.6657 heap:     0.6865 MATLAB     0.6559 speedup auto:       0.96 dot:       1.67 gus:       0.99 heap:       0.96
    7946 : auto:     0.7189(G) dot:     0.5343 gus:     0.6945 heap:     0.7463 MATLAB     0.7134 speedup auto:       0.99 dot:       1.34 gus:       1.03 heap:       0.96
   38592 : auto:     0.7393(G) dot:     0.6374 gus:     0.7187 heap:     1.1455 MATLAB     0.7671 speedup auto:       1.04 dot:       1.20 gus:       1.07 heap:       0.67
  364457 : auto:     1.0093(G) dot:     1.1753 gus:     1.0103 heap:     5.0440 MATLAB     1.1378 speedup auto:       1.13 dot:       0.97 gus:       1.13 heap:       0.23
  576000 : auto:     0.4833(d) dot:     0.4493 gus:     1.1388 heap:     7.7113 MATLAB     1.2771 speedup auto:       2.64 dot:       2.84 gus:       1.12 heap:       0.17
Elapsed time is 0.123166 seconds.

------------------------------ C = A*x
       8 : auto:     0.0004(h) dot:     0.6809 gus:     0.0004 heap:     0.0004 MATLAB     0.0012 speedup auto:       2.77 dot:       0.00 gus:       2.82 heap:       3.09
     800 : auto:     0.0068(h) dot:     1.0534 gus:     0.0086 heap:     0.0056 MATLAB     0.0108 speedup auto:       1.57 dot:       0.01 gus:       1.25 heap:       1.94
    7946 : auto:     0.0463(G) dot:     1.1930 gus:     0.0357 heap:     0.0884 MATLAB     0.0384 speedup auto:       0.83 dot:       0.03 gus:       1.07 heap:       0.43
   38581 : auto:     0.0838(G) dot:     1.2738 gus:     0.0569 heap:     0.4548 MATLAB     0.0914 speedup auto:       1.09 dot:       0.07 gus:       1.61 heap:       0.20
  364616 : auto:     0.3705(G) dot:     1.8052 gus:     0.3548 heap:     4.3830 MATLAB     0.4429 speedup auto:       1.20 dot:       0.25 gus:       1.25 heap:       0.10
  576000 : auto:     0.5201(G) dot:     1.1177 gus:     0.5084 heap:     7.2006 MATLAB     0.9128 speedup auto:       1.75 dot:       0.82 gus:       1.80 heap:       0.13
Elapsed time is 0.182837 seconds.

------------------------------ C = x'*A
       8 : auto:     0.0941(G) dot:     0.0081 gus:     0.0704 heap:     0.0903 MATLAB     0.0973 speedup auto:       1.03 dot:      12.08 gus:       1.38 heap:       1.08
     800 : auto:     0.9526(G) dot:     0.1509 gus:     0.6617 heap:     0.3502 MATLAB     0.1034 speedup auto:       0.11 dot:       0.69 gus:       0.16 heap:       0.30
    7949 : auto:     0.8496(G) dot:     0.6618 gus:     0.8473 heap:     0.7228 MATLAB     0.1619 speedup auto:       0.19 dot:       0.24 gus:       0.19 heap:       0.22
   38561 : auto:     0.2298(G) dot:     1.2662 gus:     0.2021 heap:     0.7726 MATLAB     0.2493 speedup auto:       1.09 dot:       0.20 gus:       1.23 heap:       0.32
  364005 : auto:     0.4176(G) dot:     4.0553 gus:     0.3613 heap:     2.8990 MATLAB     0.4248 speedup auto:       1.02 dot:       0.10 gus:       1.18 heap:       0.15
  576000 : auto:     0.3002(d) dot:     0.2692 gus:     0.3270 heap:     1.7369 MATLAB     0.4106 speedup auto:       1.37 dot:       1.53 gus:       1.26 heap:       0.24
Elapsed time is 0.142964 seconds.

------------------------------ C = x'*A'
       8 : auto:     0.7685(G) dot:     0.6483 gus:     0.6852 heap:     0.6880 MATLAB     0.0006 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
     800 : auto:     1.3241(G) dot:     0.7730 gus:     1.2821 heap:     1.0395 MATLAB     0.0102 speedup auto:       0.01 dot:       0.01 gus:       0.01 heap:       0.01
    7933 : auto:     1.4485(G) dot:     1.3512 gus:     1.4320 heap:     1.3562 MATLAB     0.0399 speedup auto:       0.03 dot:       0.03 gus:       0.03 heap:       0.03
   38665 : auto:     0.8530(G) dot:     1.8200 gus:     0.8441 heap:     1.3868 MATLAB     0.0876 speedup auto:       0.10 dot:       0.05 gus:       0.10 heap:       0.06
  363763 : auto:     1.0066(G) dot:     4.7446 gus:     0.9923 heap:     3.5764 MATLAB     0.4529 speedup auto:       0.45 dot:       0.10 gus:       0.46 heap:       0.13
  576000 : auto:     0.9466(d) dot:     0.9258 gus:     0.9742 heap:     2.4135 MATLAB     0.6005 speedup auto:       0.63 dot:       0.65 gus:       0.62 heap:       0.25
Elapsed time is 0.158018 seconds.

------------------------------ C = A'*x'
       8 : auto:     0.6555(h) dot:     0.0128 gus:     0.6299 heap:     0.6597 MATLAB     0.0769 speedup auto:       0.12 dot:       6.00 gus:       0.12 heap:       0.12
     798 : auto:     0.6509(h) dot:     0.3801 gus:     0.6666 heap:     0.6328 MATLAB     0.1101 speedup auto:       0.17 dot:       0.29 gus:       0.17 heap:       0.17
    7942 : auto:     0.7430(G) dot:     0.5265 gus:     0.6682 heap:     0.7222 MATLAB     0.1402 speedup auto:       0.19 dot:       0.27 gus:       0.21 heap:       0.19
   38616 : auto:     0.7135(G) dot:     0.6343 gus:     0.6997 heap:     1.0841 MATLAB     0.2306 speedup auto:       0.32 dot:       0.36 gus:       0.33 heap:       0.21
  363820 : auto:     1.0246(G) dot:     1.1490 gus:     1.0712 heap:     5.0560 MATLAB     0.4454 speedup auto:       0.43 dot:       0.39 gus:       0.42 heap:       0.09
  576000 : auto:     0.4775(d) dot:     0.4629 gus:     1.1844 heap:     8.1912 MATLAB     0.4351 speedup auto:       0.91 dot:       0.94 gus:       0.37 heap:       0.05
Elapsed time is 0.132086 seconds.

================ ncols 16

------------------------------ C = A'*x
      16 : auto:     0.6667(h) dot:     0.0237 gus:     0.6503 heap:     0.6563 MATLAB     0.6408 speedup auto:       0.96 dot:      27.06 gus:       0.99 heap:       0.98
    1600 : auto:     0.6893(h) dot:     0.7811 gus:     0.6503 heap:     0.6498 MATLAB     0.6619 speedup auto:       0.96 dot:       0.85 gus:       1.02 heap:       1.02
   15886 : auto:     0.7493(G) dot:     1.0774 gus:     0.7208 heap:     0.8224 MATLAB     0.7219 speedup auto:       0.96 dot:       0.67 gus:       1.00 heap:       0.88
   77255 : auto:     0.7801(G) dot:     1.2697 gus:     0.7932 heap:     1.5174 MATLAB     0.8213 speedup auto:       1.05 dot:       0.65 gus:       1.04 heap:       0.54
  728680 : auto:     1.3722(G) dot:     2.3504 gus:     1.3778 heap:     9.4479 MATLAB     1.4849 speedup auto:       1.08 dot:       0.63 gus:       1.08 heap:       0.16
 1152000 : auto:     0.9205(d) dot:     0.9067 gus:     1.6331 heap:    14.6548 MATLAB     1.8432 speedup auto:       2.00 dot:       2.03 gus:       1.13 heap:       0.13
Elapsed time is 0.213791 seconds.

------------------------------ C = A*x
      16 : auto:     0.0012(h) dot:     0.7039 gus:     0.0006 heap:     0.0012 MATLAB     0.0028 speedup auto:       2.33 dot:       0.00 gus:       4.46 heap:       2.44
    1599 : auto:     0.0127(h) dot:     1.4819 gus:     0.0201 heap:     0.0112 MATLAB     0.0233 speedup auto:       1.83 dot:       0.02 gus:       1.16 heap:       2.09
   15890 : auto:     0.0887(G) dot:     1.6952 gus:     0.0820 heap:     0.1819 MATLAB     0.0800 speedup auto:       0.90 dot:       0.05 gus:       0.98 heap:       0.44
   77279 : auto:     0.1497(G) dot:     1.9523 gus:     0.1157 heap:     0.8575 MATLAB     0.1704 speedup auto:       1.14 dot:       0.09 gus:       1.47 heap:       0.20
  728733 : auto:     0.7091(G) dot:     2.9451 gus:     0.7174 heap:     8.7291 MATLAB     0.8799 speedup auto:       1.24 dot:       0.30 gus:       1.23 heap:       0.10
 1152000 : auto:     1.0131(G) dot:     1.5130 gus:     1.0075 heap:    14.2918 MATLAB     1.1929 speedup auto:       1.18 dot:       0.79 gus:       1.18 heap:       0.08
Elapsed time is 0.253152 seconds.

------------------------------ C = x'*A
      16 : auto:     0.0742(G) dot:     0.0122 gus:     0.0667 heap:     0.0572 MATLAB     0.0707 speedup auto:       0.95 dot:       5.79 gus:       1.06 heap:       1.24
    1597 : auto:     0.8151(G) dot:     0.2081 gus:     0.7740 heap:     0.4033 MATLAB     0.1303 speedup auto:       0.16 dot:       0.63 gus:       0.17 heap:       0.32
   15884 : auto:     0.1878(G) dot:     1.1501 gus:     0.1755 heap:     0.3995 MATLAB     0.2164 speedup auto:       1.15 dot:       0.19 gus:       1.23 heap:       0.54
   77218 : auto:     0.2764(G) dot:     2.5066 gus:     0.2347 heap:     1.4256 MATLAB     0.3246 speedup auto:       1.17 dot:       0.13 gus:       1.38 heap:       0.23
  728212 : auto:     0.5013(G) dot:     9.5158 gus:     0.4943 heap:     5.7738 MATLAB     0.6560 speedup auto:       1.31 dot:       0.07 gus:       1.33 heap:       0.11
 1152000 : auto:     0.5643(d) dot:     0.5612 gus:     0.5684 heap:     3.2503 MATLAB     0.7202 speedup auto:       1.28 dot:       1.28 gus:       1.27 heap:       0.22
Elapsed time is 0.276019 seconds.

------------------------------ C = x'*A'
      16 : auto:     0.7121(G) dot:     0.6485 gus:     0.7008 heap:     0.7106 MATLAB     0.0011 speedup auto:       0.00 dot:       0.00 gus:       0.00 heap:       0.00
    1599 : auto:     1.4156(G) dot:     0.8804 gus:     1.4571 heap:     1.0536 MATLAB     0.0248 speedup auto:       0.02 dot:       0.03 gus:       0.02 heap:       0.02
   15894 : auto:     0.8367(G) dot:     1.7659 gus:     0.8246 heap:     1.0596 MATLAB     0.0835 speedup auto:       0.10 dot:       0.05 gus:       0.10 heap:       0.08
   77374 : auto:     0.9278(G) dot:     3.1641 gus:     0.9240 heap:     2.0323 MATLAB     0.1791 speedup auto:       0.19 dot:       0.06 gus:       0.19 heap:       0.09
  727912 : auto:     1.1457(G) dot:    10.1314 gus:     1.1449 heap:     6.4800 MATLAB     0.9390 speedup auto:       0.82 dot:       0.09 gus:       0.82 heap:       0.14
 1152000 : auto:     1.1897(d) dot:     1.1825 gus:     1.1986 heap:     3.9426 MATLAB     1.2287 speedup auto:       1.03 dot:       1.04 gus:       1.03 heap:       0.31
Elapsed time is 0.326673 seconds.

------------------------------ C = A'*x'
      16 : auto:     0.6650(h) dot:     0.0231 gus:     0.6791 heap:     0.6549 MATLAB     0.0845 speedup auto:       0.13 dot:       3.67 gus:       0.12 heap:       0.13
    1600 : auto:     0.6681(h) dot:     0.8220 gus:     0.7316 heap:     0.7291 MATLAB     0.1367 speedup auto:       0.20 dot:       0.17 gus:       0.19 heap:       0.19
   15878 : auto:     0.7510(G) dot:     1.0888 gus:     0.7327 heap:     0.8119 MATLAB     0.2127 speedup auto:       0.28 dot:       0.20 gus:       0.29 heap:       0.26
   77286 : auto:     0.7775(G) dot:     1.3319 gus:     0.7909 heap:     1.6312 MATLAB     0.3078 speedup auto:       0.40 dot:       0.23 gus:       0.39 heap:       0.19
  728384 : auto:     1.3870(G) dot:     2.4313 gus:     1.3941 heap:     9.3735 MATLAB     0.6785 speedup auto:       0.49 dot:       0.28 gus:       0.49 heap:       0.07
 1152000 : auto:     0.9365(d) dot:     0.9222 gus:     1.7085 heap:    14.7107 MATLAB     0.7183 speedup auto:       0.77 dot:       0.78 gus:       0.42 heap:       0.05
Elapsed time is 0.240359 seconds.

Prob = 

  struct with fields:

      name: 'Freescale/Freescale2'
     title: 'circuit simulation matrix from Freescale'
         A: [2999349×2999349 double]
     Zeros: [2999349×2999349 double]
        id: 2662
      date: '2015'
    author: 'K. Gullapalli'
        ed: 'T. Davis'
      kind: 'circuit simulation matrix'
     notes: [459 char]


================ ncols 1

------------------------------ C = A'*x
       1 : auto:     0.0095(d) dot:     0.0170 gus:     0.3275 heap:     0.2273 MATLAB     0.2176 speedup auto:      22.80 dot:      12.83 gus:       0.66 heap:       0.96
     100 : auto:     0.0708(d) dot:     0.0593 gus:     0.2471 heap:     0.2207 MATLAB     0.2321 speedup auto:       3.28 dot:       3.91 gus:       0.94 heap:       1.05
    1000 : auto:     0.0638(d) dot:     0.0574 gus:     0.2400 heap:     0.2240 MATLAB     0.2269 speedup auto:       3.56 dot:       3.95 gus:       0.95 heap:       1.01
    4993 : auto:     0.0656(d) dot:     0.0637 gus:     0.2190 heap:     0.2253 MATLAB     0.2313 speedup auto:       3.52 dot:       3.63 gus:       1.06 heap:       1.03
   71145 : auto:     0.0884(d) dot:     0.0702 gus:     0.3518 heap:     0.2451 MATLAB     0.2634 speedup auto:       2.98 dot:       3.75 gus:       0.75 heap:       1.07
 2999349 : auto:     0.0691(d) dot:     0.0745 gus:     0.4362 heap:     1.6621 MATLAB     0.4474 speedup auto:       6.47 dot:       6.01 gus:       1.03 heap:       0.27
Elapsed time is 0.057655 seconds.

------------------------------ C = A*x
       1 : auto:     0.0036(h) dot:     0.3505 gus:     0.0078 heap:     0.0039 MATLAB     0.0127 speedup auto:       3.54 dot:       0.04 gus:       1.64 heap:       3.29
     100 : auto:     0.0037(h) dot:     0.2964 gus:     0.0094 heap:     0.0040 MATLAB     0.0107 speedup auto:       2.94 dot:       0.04 gus:       1.15 heap:       2.70
    1000 : auto:     0.0042(h) dot:     0.2860 gus:     0.0096 heap:     0.0040 MATLAB     0.0103 speedup auto:       2.43 dot:       0.04 gus:       1.08 heap:       2.57
    4999 : auto:     0.0057(h) dot:     0.2906 gus:     0.0150 heap:     0.0059 MATLAB     0.0157 speedup auto:       2.77 dot:       0.05 gus:       1.05 heap:       2.68
   71108 : auto:     0.0829(G) dot:     0.3636 gus:     0.0449 heap:     0.0381 MATLAB     0.0488 speedup auto:       0.59 dot:       0.13 gus:       1.09 heap:       1.28
 2999349 : auto:     0.1349(G) dot:     0.3938 gus:     0.1018 heap:     1.4301 MATLAB     0.2401 speedup auto:       1.78 dot:       0.61 gus:       2.36 heap:       0.17
Elapsed time is 0.075915 seconds.

------------------------------ C = x'*A
       1 : auto:     0.0606(d) dot:     0.0439 gus:     0.1057 heap:     0.0600 MATLAB     0.0879 speedup auto:       1.45 dot:       2.00 gus:       0.83 heap:       1.46
     100 : auto:     0.2169(d) dot:     0.2033 gus:     0.2408 heap:     0.1989 MATLAB     0.0899 speedup auto:       0.41 dot:       0.44 gus:       0.37 heap:       0.45
    1000 : auto:     0.3119(d) dot:     0.3000 gus:     0.3597 heap:     0.3016 MATLAB     0.0827 speedup auto:       0.27 dot:       0.28 gus:       0.23 heap:       0.27
    4992 : auto:     0.4347(d) dot:     0.4335 gus:     0.4319 heap:     0.3652 MATLAB     0.0864 speedup auto:       0.20 dot:       0.20 gus:       0.20 heap:       0.24
   71164 : auto:     0.6709(d) dot:     0.6382 gus:     0.7251 heap:     0.6011 MATLAB     0.1059 speedup auto:       0.16 dot:       0.17 gus:       0.15 heap:       0.18
 2999349 : auto:     0.1120(d) dot:     0.1136 gus:     0.2283 heap:     0.3088 MATLAB     0.1771 speedup auto:       1.58 dot:       1.56 gus:       0.78 heap:       0.57
Elapsed time is 0.140396 seconds.

------------------------------ C = x'*A'
       1 : auto:     0.3644(d) dot:     0.2570 gus:     0.3074 heap:     0.2620 MATLAB     0.0137 speedup auto:       0.04 dot:       0.05 gus:       0.04 heap:       0.05
     100 : auto:     0.4262(d) dot:     0.4030 gus:     0.4486 heap:     0.4170 MATLAB     0.0145 speedup auto:       0.03 dot:       0.04 gus:       0.03 heap:       0.03
     999 : auto:     0.5562(d) dot:     0.5209 gus:     0.5423 heap:     0.5006 MATLAB     0.0184 speedup auto:       0.03 dot:       0.04 gus:       0.03 heap:       0.04
    4999 : auto:     0.6358(d) dot:     0.6253 gus:     0.6377 heap:     0.5815 MATLAB     0.0177 speedup auto:       0.03 dot:       0.03 gus:       0.03 heap:       0.03
   71097 : auto:     0.9914(d) dot:     0.8669 gus:     0.9244 heap:     0.8316 MATLAB     0.0472 speedup auto:       0.05 dot:       0.05 gus:       0.05 heap:       0.06
 2999349 : auto:     0.4178(d) dot:     0.3089 gus:     0.4255 heap:     0.5274 MATLAB     0.1824 speedup auto:       0.44 dot:       0.59 gus:       0.43 heap:       0.35
Elapsed time is 0.117433 seconds.

------------------------------ C = A'*x'
       1 : auto:     0.0290(d) dot:     0.0254 gus:     0.3274 heap:     0.2174 MATLAB     0.0808 speedup auto:       2.79 dot:       3.18 gus:       0.25 heap:       0.37
     100 : auto:     0.0718(d) dot:     0.0556 gus:     0.2159 heap:     0.2084 MATLAB     0.0812 speedup auto:       1.13 dot:       1.46 gus:       0.38 heap:       0.39
    1000 : auto:     0.0755(d) dot:     0.0588 gus:     0.2123 heap:     0.2126 MATLAB     0.0859 speedup auto:       1.14 dot:       1.46 gus:       0.40 heap:       0.40
    4995 : auto:     0.0827(d) dot:     0.0627 gus:     0.2221 heap:     0.2147 MATLAB     0.0855 speedup auto:       1.03 dot:       1.36 gus:       0.38 heap:       0.40
   71090 : auto:     0.0924(d) dot:     0.0782 gus:     0.3631 heap:     0.2470 MATLAB     0.0971 speedup auto:       1.05 dot:       1.24 gus:       0.27 heap:       0.39
 2999349 : auto:     0.0734(d) dot:     0.0836 gus:     0.4485 heap:     1.6979 MATLAB     0.1732 speedup auto:       2.36 dot:       2.07 gus:       0.39 heap:       0.10
Elapsed time is 0.102219 seconds.

================ ncols 2

------------------------------ C = A'*x
       2 : auto:     0.3271(h) dot:     0.0215 gus:     0.2188 heap:     0.2102 MATLAB     0.2315 speedup auto:       0.71 dot:      10.75 gus:       1.06 heap:       1.10
     200 : auto:     0.2408(h) dot:     0.0968 gus:     0.2142 heap:     0.2061 MATLAB     0.2151 speedup auto:       0.89 dot:       2.22 gus:       1.00 heap:       1.04
    2000 : auto:     0.2333(h) dot:     0.1010 gus:     0.2185 heap:     0.2125 MATLAB     0.2250 speedup auto:       0.96 dot:       2.23 gus:       1.03 heap:       1.06
    9996 : auto:     0.2384(h) dot:     0.1029 gus:     0.2170 heap:     0.2158 MATLAB     0.2432 speedup auto:       1.02 dot:       2.36 gus:       1.12 heap:       1.13
  142265 : auto:     0.4225(G) dot:     0.1344 gus:     0.2862 heap:     0.2766 MATLAB     0.2878 speedup auto:       0.68 dot:       2.14 gus:       1.01 heap:       1.04
 5998698 : auto:     0.1367(d) dot:     0.1641 gus:     0.5665 heap:     3.1490 MATLAB     0.7298 speedup auto:       5.34 dot:       4.45 gus:       1.29 heap:       0.23
Elapsed time is 0.123203 seconds.

------------------------------ C = A*x
       2 : auto:     0.0069(h) dot:     0.3508 gus:     0.0113 heap:     0.0067 MATLAB     0.0196 speedup auto:       2.86 dot:       0.06 gus:       1.73 heap:       2.92
     200 : auto:     0.0069(h) dot:     0.3280 gus:     0.0108 heap:     0.0083 MATLAB     0.0124 speedup auto:       1.79 dot:       0.04 gus:       1.14 heap:       1.49
    2000 : auto:     0.0077(h) dot:     0.3273 gus:     0.0119 heap:     0.0075 MATLAB     0.0157 speedup auto:       2.03 dot:       0.05 gus:       1.32 heap:       2.10
    9994 : auto:     0.0105(h) dot:     0.3319 gus:     0.0190 heap:     0.0105 MATLAB     0.0231 speedup auto:       2.19 dot:       0.07 gus:       1.21 heap:       2.20
  142287 : auto:     0.1137(G) dot:     0.4333 gus:     0.0896 heap:     0.0730 MATLAB     0.0932 speedup auto:       0.82 dot:       0.21 gus:       1.04 heap:       1.28
 5998698 : auto:     0.2577(G) dot:     0.4448 gus:     0.1978 heap:     2.7806 MATLAB     0.4671 speedup auto:       1.81 dot:       1.05 gus:       2.36 heap:       0.17
Elapsed time is 0.127689 seconds.

------------------------------ C = x'*A
       2 : auto:     0.1228(G) dot:     0.0555 gus:     0.1073 heap:     0.0702 MATLAB     0.0898 speedup auto:       0.73 dot:       1.62 gus:       0.84 heap:       1.28
     200 : auto:     0.3039(G) dot:     0.3572 gus:     0.2610 heap:     0.2254 MATLAB     0.0900 speedup auto:       0.30 dot:       0.25 gus:       0.34 heap:       0.40
    2000 : auto:     0.3931(G) dot:     0.5638 gus:     0.3768 heap:     0.3203 MATLAB     0.1009 speedup auto:       0.26 dot:       0.18 gus:       0.27 heap:       0.32
    9988 : auto:     0.5080(G) dot:     0.7912 gus:     0.4992 heap:     0.4372 MATLAB     0.0966 speedup auto:       0.19 dot:       0.12 gus:       0.19 heap:       0.22
  142232 : auto:     0.9535(G) dot:     1.3117 gus:     0.9209 heap:     0.7064 MATLAB     0.1314 speedup auto:       0.14 dot:       0.10 gus:       0.14 heap:       0.19
 5998698 : auto:     0.1606(d) dot:     0.1896 gus:     0.4129 heap:     0.4802 MATLAB     0.2827 speedup auto:       1.76 dot:       1.49 gus:       0.68 heap:       0.59
Elapsed time is 0.192230 seconds.

------------------------------ C = x'*A'
       2 : auto:     0.4305(G) dot:     0.2630 gus:     0.3050 heap:     0.2775 MATLAB     0.0165 speedup auto:       0.04 dot:       0.06 gus:       0.05 heap:       0.06
     200 : auto:     0.5089(G) dot:     0.5570 gus:     0.4799 heap:     0.4303 MATLAB     0.0141 speedup auto:       0.03 dot:       0.03 gus:       0.03 heap:       0.03
    2000 : auto:     0.6174(G) dot:     0.7649 gus:     0.5787 heap:     0.5474 MATLAB     0.0153 speedup auto:       0.02 dot:       0.02 gus:       0.03 heap:       0.03
    9987 : auto:     0.7138(G) dot:     1.0029 gus:     0.7142 heap:     0.6272 MATLAB     0.0237 speedup auto:       0.03 dot:       0.02 gus:       0.03 heap:       0.04
  142217 : auto:     1.2741(G) dot:     1.5542 gus:     1.1270 heap:     0.9492 MATLAB     0.0851 speedup auto:       0.07 dot:       0.05 gus:       0.08 heap:       0.09
 5998698 : auto:     0.4610(d) dot:     0.3994 gus:     0.5959 heap:     0.7094 MATLAB     0.3535 speedup auto:       0.77 dot:       0.89 gus:       0.59 heap:       0.50
Elapsed time is 0.188580 seconds.

------------------------------ C = A'*x'
       2 : auto:     0.3305(h) dot:     0.0341 gus:     0.2218 heap:     0.2224 MATLAB     0.0821 speedup auto:       0.25 dot:       2.40 gus:       0.37 heap:       0.37
     200 : auto:     0.2344(h) dot:     0.1197 gus:     0.2221 heap:     0.2235 MATLAB     0.0773 speedup auto:       0.33 dot:       0.65 gus:       0.35 heap:       0.35
    1999 : auto:     0.2419(h) dot:     0.1168 gus:     0.2220 heap:     0.2235 MATLAB     0.0997 speedup auto:       0.41 dot:       0.85 gus:       0.45 heap:       0.45
    9996 : auto:     0.2392(h) dot:     0.1156 gus:     0.2316 heap:     0.2296 MATLAB     0.0847 speedup auto:       0.35 dot:       0.73 gus:       0.37 heap:       0.37
  142282 : auto:     0.4628(G) dot:     0.1420 gus:     0.2979 heap:     0.2872 MATLAB     0.1181 speedup auto:       0.26 dot:       0.83 gus:       0.40 heap:       0.41
 5998698 : auto:     0.2005(d) dot:     0.1972 gus:     0.6350 heap:     3.0203 MATLAB     0.2464 speedup auto:       1.23 dot:       1.25 gus:       0.39 heap:       0.08
Elapsed time is 0.172236 seconds.

================ ncols 3

------------------------------ C = A'*x
       3 : auto:     0.3264(h) dot:     0.0416 gus:     0.3241 heap:     0.2140 MATLAB     0.2326 speedup auto:       0.71 dot:       5.59 gus:       0.72 heap:       1.09
     300 : auto:     0.2328(h) dot:     0.1500 gus:     0.2243 heap:     0.2097 MATLAB     0.2225 speedup auto:       0.96 dot:       1.48 gus:       0.99 heap:       1.06
    3000 : auto:     0.2329(h) dot:     0.1609 gus:     0.2169 heap:     0.2138 MATLAB     0.2247 speedup auto:       0.96 dot:       1.40 gus:       1.04 heap:       1.05
   14991 : auto:     0.2463(h) dot:     0.1596 gus:     0.2325 heap:     0.2133 MATLAB     0.2520 speedup auto:       1.02 dot:       1.58 gus:       1.08 heap:       1.18
  213400 : auto:     0.4586(G) dot:     0.2086 gus:     0.3404 heap:     0.3114 MATLAB     0.3333 speedup auto:       0.73 dot:       1.60 gus:       0.98 heap:       1.07
 8998047 : auto:     0.1908(d) dot:     0.2245 gus:     0.6072 heap:     4.4123 MATLAB     0.9714 speedup auto:       5.09 dot:       4.33 gus:       1.60 heap:       0.22
Elapsed time is 0.174603 seconds.

------------------------------ C = A*x
       3 : auto:     0.0099(h) dot:     0.2818 gus:     0.0112 heap:     0.0100 MATLAB     0.0181 speedup auto:       1.82 dot:       0.06 gus:       1.62 heap:       1.81
     300 : auto:     0.0102(h) dot:     0.3777 gus:     0.0115 heap:     0.0101 MATLAB     0.0098 speedup auto:       0.96 dot:       0.03 gus:       0.85 heap:       0.97
    3000 : auto:     0.0112(h) dot:     0.3969 gus:     0.0162 heap:     0.0111 MATLAB     0.0136 speedup auto:       1.22 dot:       0.03 gus:       0.84 heap:       1.22
   14995 : auto:     0.0196(h) dot:     0.3646 gus:     0.0249 heap:     0.0160 MATLAB     0.0237 speedup auto:       1.21 dot:       0.06 gus:       0.95 heap:       1.48
  213511 : auto:     0.1434(G) dot:     0.5112 gus:     0.1280 heap:     0.1061 MATLAB     0.1212 speedup auto:       0.85 dot:       0.24 gus:       0.95 heap:       1.14
 8998047 : auto:     0.3156(G) dot:     0.5553 gus:     0.3031 heap:     4.1168 MATLAB     0.6971 speedup auto:       2.21 dot:       1.26 gus:       2.30 heap:       0.17
Elapsed time is 0.191939 seconds.

------------------------------ C = x'*A
       3 : auto:     0.1104(G) dot:     0.0709 gus:     0.1002 heap:     0.0686 MATLAB     0.1007 speedup auto:       0.91 dot:       1.42 gus:       1.01 heap:       1.47
     300 : auto:     0.2972(G) dot:     0.5089 gus:     0.2795 heap:     0.2478 MATLAB     0.0936 speedup auto:       0.31 dot:       0.18 gus:       0.33 heap:       0.38
    2999 : auto:     0.4189(G) dot:     0.8300 gus:     0.4131 heap:     0.3596 MATLAB     0.0903 speedup auto:       0.22 dot:       0.11 gus:       0.22 heap:       0.25
   14989 : auto:     0.5233(G) dot:     1.1482 gus:     0.5384 heap:     0.4701 MATLAB     0.0930 speedup auto:       0.18 dot:       0.08 gus:       0.17 heap:       0.20
  213450 : auto:     1.0919(G) dot:     2.0619 gus:     1.0879 heap:     0.7826 MATLAB     0.1485 speedup auto:       0.14 dot:       0.07 gus:       0.14 heap:       0.19
 8998047 : auto:     0.2204(d) dot:     0.2510 gus:     0.5189 heap:     0.6867 MATLAB     0.3827 speedup auto:       1.74 dot:       1.52 gus:       0.74 heap:       0.56
Elapsed time is 0.342243 seconds.

------------------------------ C = x'*A'
       3 : auto:     0.4467(G) dot:     0.2651 gus:     0.3251 heap:     0.2817 MATLAB     0.0186 speedup auto:       0.04 dot:       0.07 gus:       0.06 heap:       0.07
     300 : auto:     0.5031(G) dot:     0.7213 gus:     0.4871 heap:     0.4577 MATLAB     0.0147 speedup auto:       0.03 dot:       0.02 gus:       0.03 heap:       0.03
    2999 : auto:     0.6340(G) dot:     1.0115 gus:     0.6079 heap:     0.5687 MATLAB     0.0185 speedup auto:       0.03 dot:       0.02 gus:       0.03 heap:       0.03
   14981 : auto:     0.8255(G) dot:     1.3650 gus:     0.7318 heap:     0.6625 MATLAB     0.0293 speedup auto:       0.04 dot:       0.02 gus:       0.04 heap:       0.04
  213377 : auto:     1.3942(G) dot:     2.2670 gus:     1.2889 heap:     0.9877 MATLAB     0.1199 speedup auto:       0.09 dot:       0.05 gus:       0.09 heap:       0.12
 8998047 : auto:     0.5375(d) dot:     0.4987 gus:     0.7331 heap:     0.8634 MATLAB     0.5355 speedup auto:       1.00 dot:       1.07 gus:       0.73 heap:       0.62
Elapsed time is 0.342296 seconds.

------------------------------ C = A'*x'
       3 : auto:     0.2335(h) dot:     0.0455 gus:     0.2338 heap:     0.2192 MATLAB     0.0945 speedup auto:       0.40 dot:       2.07 gus:       0.40 heap:       0.43
     300 : auto:     0.2457(h) dot:     0.1677 gus:     0.2364 heap:     0.2151 MATLAB     0.0812 speedup auto:       0.33 dot:       0.48 gus:       0.34 heap:       0.38
    3000 : auto:     0.2447(h) dot:     0.1714 gus:     0.2298 heap:     0.2146 MATLAB     0.0809 speedup auto:       0.33 dot:       0.47 gus:       0.35 heap:       0.38
   14984 : auto:     0.2375(h) dot:     0.1728 gus:     0.2404 heap:     0.2271 MATLAB     0.0932 speedup auto:       0.39 dot:       0.54 gus:       0.39 heap:       0.41
  213410 : auto:     0.4581(G) dot:     0.2072 gus:     0.3382 heap:     0.3293 MATLAB     0.1428 speedup auto:       0.31 dot:       0.69 gus:       0.42 heap:       0.43
 8998047 : auto:     0.3647(d) dot:     0.3032 gus:     0.6781 heap:     4.4834 MATLAB     0.3575 speedup auto:       0.98 dot:       1.18 gus:       0.53 heap:       0.08
Elapsed time is 0.282500 seconds.

================ ncols 4

------------------------------ C = A'*x
       4 : auto:     0.2520(h) dot:     0.0524 gus:     0.3135 heap:     0.2158 MATLAB     0.2359 speedup auto:       0.94 dot:       4.50 gus:       0.75 heap:       1.09
     400 : auto:     0.2347(h) dot:     0.2050 gus:     0.2211 heap:     0.2181 MATLAB     0.2213 speedup auto:       0.94 dot:       1.08 gus:       1.00 heap:       1.01
    4000 : auto:     0.2303(h) dot:     0.2015 gus:     0.2168 heap:     0.2268 MATLAB     0.2310 speedup auto:       1.00 dot:       1.15 gus:       1.07 heap:       1.02
   19987 : auto:     0.2328(h) dot:     0.2153 gus:     0.2460 heap:     0.2186 MATLAB     0.2605 speedup auto:       1.12 dot:       1.21 gus:       1.06 heap:       1.19
  284600 : auto:     0.4946(G) dot:     0.2844 gus:     0.3736 heap:     0.3403 MATLAB     0.3836 speedup auto:       0.78 dot:       1.35 gus:       1.03 heap:       1.13
11997396 : auto:     0.2612(d) dot:     0.2994 gus:     0.7746 heap:     5.8018 MATLAB     1.1749 speedup auto:       4.50 dot:       3.92 gus:       1.52 heap:       0.20
Elapsed time is 0.232088 seconds.

------------------------------ C = A*x
       4 : auto:     0.0133(h) dot:     0.2743 gus:     0.0129 heap:     0.0132 MATLAB     0.0178 speedup auto:       1.34 dot:       0.06 gus:       1.38 heap:       1.35
     400 : auto:     0.0133(h) dot:     0.4396 gus:     0.0135 heap:     0.0134 MATLAB     0.0097 speedup auto:       0.73 dot:       0.02 gus:       0.72 heap:       0.73
    4000 : auto:     0.0146(h) dot:     0.4330 gus:     0.0173 heap:     0.0146 MATLAB     0.0151 speedup auto:       1.04 dot:       0.03 gus:       0.87 heap:       1.03
   19986 : auto:     0.0202(h) dot:     0.4831 gus:     0.0296 heap:     0.0203 MATLAB     0.0290 speedup auto:       1.44 dot:       0.06 gus:       0.98 heap:       1.43
  284501 : auto:     0.1879(G) dot:     0.5922 gus:     0.1684 heap:     0.1376 MATLAB     0.1587 speedup auto:       0.84 dot:       0.27 gus:       0.94 heap:       1.15
11997396 : auto:     0.4127(G) dot:     0.6264 gus:     0.3847 heap:     5.4754 MATLAB     0.8909 speedup auto:       2.16 dot:       1.42 gus:       2.32 heap:       0.16
Elapsed time is 0.216481 seconds.

------------------------------ C = x'*A
       4 : auto:     0.1290(G) dot:     0.0694 gus:     0.1146 heap:     0.0745 MATLAB     0.1067 speedup auto:       0.83 dot:       1.54 gus:       0.93 heap:       1.43
     400 : auto:     0.3039(G) dot:     0.6754 gus:     0.2945 heap:     0.2530 MATLAB     0.0991 speedup auto:       0.33 dot:       0.15 gus:       0.34 heap:       0.39
    3998 : auto:     0.4567(G) dot:     1.0678 gus:     0.4215 heap:     0.3795 MATLAB     0.0846 speedup auto:       0.19 dot:       0.08 gus:       0.20 heap:       0.22
   19986 : auto:     0.5808(G) dot:     1.5676 gus:     0.5658 heap:     0.5057 MATLAB     0.0959 speedup auto:       0.17 dot:       0.06 gus:       0.17 heap:       0.19
  284660 : auto:     1.1972(G) dot:     2.9035 gus:     1.2874 heap:     0.8521 MATLAB     0.1785 speedup auto:       0.15 dot:       0.06 gus:       0.14 heap:       0.21
11997396 : auto:     0.3083(d) dot:     0.3657 gus:     0.7196 heap:     0.9424 MATLAB     0.5689 speedup auto:       1.85 dot:       1.56 gus:       0.79 heap:       0.60
Elapsed time is 0.359963 seconds.

------------------------------ C = x'*A'
       4 : auto:     0.4483(G) dot:     0.2794 gus:     0.3305 heap:     0.2952 MATLAB     0.0216 speedup auto:       0.05 dot:       0.08 gus:       0.07 heap:       0.07
     400 : auto:     0.5341(G) dot:     0.8565 gus:     0.5592 heap:     0.4758 MATLAB     0.0161 speedup auto:       0.03 dot:       0.02 gus:       0.03 heap:       0.03
    4000 : auto:     0.6659(G) dot:     1.3245 gus:     0.6627 heap:     0.5973 MATLAB     0.0327 speedup auto:       0.05 dot:       0.02 gus:       0.05 heap:       0.05
   19979 : auto:     1.0166(G) dot:     1.9738 gus:     0.8577 heap:     0.7813 MATLAB     0.0353 speedup auto:       0.03 dot:       0.02 gus:       0.04 heap:       0.05
  284468 : auto:     1.7195(G) dot:     3.2038 gus:     1.4308 heap:     1.0440 MATLAB     0.1565 speedup auto:       0.09 dot:       0.05 gus:       0.11 heap:       0.15
11997396 : auto:     0.5859(d) dot:     0.5453 gus:     0.8992 heap:     1.0586 MATLAB     0.7912 speedup auto:       1.35 dot:       1.45 gus:       0.88 heap:       0.75
Elapsed time is 0.385193 seconds.

------------------------------ C = A'*x'
       4 : auto:     0.2321(h) dot:     0.0558 gus:     0.2275 heap:     0.2345 MATLAB     0.0953 speedup auto:       0.41 dot:       1.71 gus:       0.42 heap:       0.41
     400 : auto:     0.2523(h) dot:     0.2100 gus:     0.2265 heap:     0.2292 MATLAB     0.0822 speedup auto:       0.33 dot:       0.39 gus:       0.36 heap:       0.36
    4000 : auto:     0.2505(h) dot:     0.2193 gus:     0.2196 heap:     0.2252 MATLAB     0.0881 speedup auto:       0.35 dot:       0.40 gus:       0.40 heap:       0.39
   19984 : auto:     0.3505(h) dot:     0.2517 gus:     0.2585 heap:     0.2324 MATLAB     0.0851 speedup auto:       0.24 dot:       0.34 gus:       0.33 heap:       0.37
  284566 : auto:     0.5121(G) dot:     0.2796 gus:     0.3778 heap:     0.3586 MATLAB     0.1549 speedup auto:       0.30 dot:       0.55 gus:       0.41 heap:       0.43
11997396 : auto:     0.4176(d) dot:     0.3878 gus:     0.9164 heap:     5.9493 MATLAB     0.4408 speedup auto:       1.06 dot:       1.14 gus:       0.48 heap:       0.07
Elapsed time is 0.321431 seconds.

================ ncols 8

------------------------------ C = A'*x
       8 : auto:     0.2452(h) dot:     0.0976 gus:     0.3142 heap:     0.2157 MATLAB     0.2354 speedup auto:       0.96 dot:       2.41 gus:       0.75 heap:       1.09
     800 : auto:     0.2381(h) dot:     0.3999 gus:     0.2145 heap:     0.2051 MATLAB     0.3436 speedup auto:       1.44 dot:       0.86 gus:       1.60 heap:       1.68
    7999 : auto:     0.2291(h) dot:     0.4525 gus:     0.2188 heap:     0.2241 MATLAB     0.2263 speedup auto:       0.99 dot:       0.50 gus:       1.03 heap:       1.01
   39963 : auto:     0.3525(h) dot:     0.4042 gus:     0.2554 heap:     0.2199 MATLAB     0.2523 speedup auto:       0.72 dot:       0.62 gus:       0.99 heap:       1.15
  569203 : auto:     0.6735(G) dot:     0.5464 gus:     0.5307 heap:     0.4790 MATLAB     0.5196 speedup auto:       0.77 dot:       0.95 gus:       0.98 heap:       1.08
23994792 : auto:     0.6480(d) dot:     0.6528 gus:     1.2562 heap:    11.4999 MATLAB     2.3466 speedup auto:       3.62 dot:       3.59 gus:       1.87 heap:       0.20
Elapsed time is 0.600676 seconds.

------------------------------ C = A*x
       8 : auto:     0.0103(h) dot:     0.2967 gus:     0.0074 heap:     0.0156 MATLAB     0.0240 speedup auto:       2.32 dot:       0.08 gus:       3.25 heap:       1.53
     800 : auto:     0.0004(h) dot:     0.6784 gus:     0.0073 heap:     0.0165 MATLAB     0.0151 speedup auto:      39.30 dot:       0.02 gus:       2.06 heap:       0.91
    7999 : auto:     0.0186(h) dot:     0.7355 gus:     0.0172 heap:     0.0050 MATLAB     0.0267 speedup auto:       1.44 dot:       0.04 gus:       1.55 heap:       5.38
   39966 : auto:     0.0283(h) dot:     0.6515 gus:     0.0429 heap:     0.0313 MATLAB     0.0508 speedup auto:       1.80 dot:       0.08 gus:       1.18 heap:       1.62
  569322 : auto:     0.3374(G) dot:     0.8628 gus:     0.3205 heap:     0.2821 MATLAB     0.3107 speedup auto:       0.92 dot:       0.36 gus:       0.97 heap:       1.10
23994792 : auto:     0.9323(G) dot:     1.0022 gus:     0.9643 heap:    11.1288 MATLAB     1.9375 speedup auto:       2.08 dot:       1.93 gus:       2.01 heap:       0.17
Elapsed time is 0.527815 seconds.

------------------------------ C = x'*A
       8 : auto:     0.1286(G) dot:     0.0881 gus:     0.1181 heap:     0.0844 MATLAB     0.1409 speedup auto:       1.10 dot:       1.60 gus:       1.19 heap:       1.67
     800 : auto:     0.3749(G) dot:     1.2481 gus:     0.3148 heap:     0.2697 MATLAB     0.0912 speedup auto:       0.24 dot:       0.07 gus:       0.29 heap:       0.34
    7998 : auto:     0.4757(G) dot:     2.1080 gus:     0.4729 heap:     0.4021 MATLAB     0.0877 speedup auto:       0.18 dot:       0.04 gus:       0.19 heap:       0.22
   39968 : auto:     0.6941(G) dot:     3.1377 gus:     0.6659 heap:     0.5478 MATLAB     0.1055 speedup auto:       0.15 dot:       0.03 gus:       0.16 heap:       0.19
  569041 : auto:     0.2769(G) dot:     6.3380 gus:     0.2517 heap:     0.2112 MATLAB     0.2242 speedup auto:       0.81 dot:       0.04 gus:       0.89 heap:       1.06
23994792 : auto:     0.7675(d) dot:     0.7199 gus:     1.3994 heap:     1.9811 MATLAB     1.4354 speedup auto:       1.87 dot:       1.99 gus:       1.03 heap:       0.72
Elapsed time is 1.180739 seconds.

------------------------------ C = x'*A'
       8 : auto:     0.3870(G) dot:     0.2927 gus:     0.3260 heap:     0.2835 MATLAB     0.0278 speedup auto:       0.07 dot:       0.09 gus:       0.09 heap:       0.10
     800 : auto:     0.5841(G) dot:     1.4540 gus:     0.6042 heap:     0.4906 MATLAB     0.0146 speedup auto:       0.02 dot:       0.01 gus:       0.02 heap:       0.03
    7997 : auto:     0.8025(G) dot:     2.3245 gus:     0.6863 heap:     0.6070 MATLAB     0.0251 speedup auto:       0.03 dot:       0.01 gus:       0.04 heap:       0.04
   39963 : auto:     0.9791(G) dot:     3.4345 gus:     0.8788 heap:     0.7810 MATLAB     0.0510 speedup auto:       0.05 dot:       0.01 gus:       0.06 heap:       0.07
  569166 : auto:     0.6189(G) dot:     6.6914 gus:     0.4645 heap:     0.4338 MATLAB     0.2720 speedup auto:       0.44 dot:       0.04 gus:       0.59 heap:       0.63
23994792 : auto:     1.1948(d) dot:     0.9752 gus:     1.6497 heap:     2.2069 MATLAB     1.9524 speedup auto:       1.63 dot:       2.00 gus:       1.18 heap:       0.88
Elapsed time is 1.189189 seconds.

------------------------------ C = A'*x'
       8 : auto:     0.2309(h) dot:     0.0993 gus:     0.3022 heap:     0.2214 MATLAB     0.1050 speedup auto:       0.45 dot:       1.06 gus:       0.35 heap:       0.47
     800 : auto:     0.2340(h) dot:     0.4103 gus:     0.2190 heap:     0.2398 MATLAB     0.0870 speedup auto:       0.37 dot:       0.21 gus:       0.40 heap:       0.36
    7998 : auto:     0.3263(h) dot:     0.4184 gus:     0.2308 heap:     0.2211 MATLAB     0.1023 speedup auto:       0.31 dot:       0.24 gus:       0.44 heap:       0.46
   39958 : auto:     0.3698(h) dot:     0.4233 gus:     0.2547 heap:     0.2344 MATLAB     0.0962 speedup auto:       0.26 dot:       0.23 gus:       0.38 heap:       0.41
  569119 : auto:     0.7180(G) dot:     0.6431 gus:     0.5539 heap:     0.5554 MATLAB     0.2022 speedup auto:       0.28 dot:       0.31 gus:       0.36 heap:       0.36
23994792 : auto:     1.0951(d) dot:     1.0843 gus:     1.7083 heap:    12.1058 MATLAB     1.0918 speedup auto:       1.00 dot:       1.01 gus:       0.64 heap:       0.09
Elapsed time is 0.936786 seconds.

================ ncols 16

------------------------------ C = A'*x
      16 : auto:     0.3209(h) dot:     0.1748 gus:     0.2574 heap:     0.2141 MATLAB     0.2508 speedup auto:       0.78 dot:       1.43 gus:       0.97 heap:       1.17
    1599 : auto:     0.2362(h) dot:     0.8162 gus:     0.2127 heap:     0.2305 MATLAB     0.3233 speedup auto:       1.37 dot:       0.40 gus:       1.52 heap:       1.40
   15995 : auto:     0.2451(h) dot:     0.8043 gus:     0.2432 heap:     0.2359 MATLAB     0.2382 speedup auto:       0.97 dot:       0.30 gus:       0.98 heap:       1.01
   79924 : auto:     0.3720(h) dot:     0.8775 gus:     0.2918 heap:     0.2499 MATLAB     0.2939 speedup auto:       0.79 dot:       0.33 gus:       1.01 heap:       1.18
 1138397 : auto:     0.9946(G) dot:     1.0958 gus:     0.8551 heap:     0.8250 MATLAB     0.8139 speedup auto:       0.82 dot:       0.74 gus:       0.95 heap:       0.99
47989584 : auto:     1.2976(d) dot:     1.3650 gus:     2.1611 heap:    23.6982 MATLAB     4.7148 speedup auto:       3.63 dot:       3.45 gus:       2.18 heap:       0.20
Elapsed time is 1.160734 seconds.

------------------------------ C = A*x
      16 : auto:     0.0070(h) dot:     0.5005 gus:     0.0076 heap:     0.0162 MATLAB     0.0405 speedup auto:       5.81 dot:       0.08 gus:       5.31 heap:       2.50
    1600 : auto:     0.0007(h) dot:     1.1098 gus:     0.0088 heap:     0.0039 MATLAB     0.0154 speedup auto:      20.66 dot:       0.01 gus:       1.76 heap:       3.94
   15995 : auto:     0.0216(h) dot:     1.0372 gus:     0.0212 heap:     0.0069 MATLAB     0.0257 speedup auto:       1.19 dot:       0.02 gus:       1.21 heap:       3.70
   79923 : auto:     0.0425(h) dot:     1.0571 gus:     0.0735 heap:     0.0774 MATLAB     0.0927 speedup auto:       2.18 dot:       0.09 gus:       1.26 heap:       1.20
 1138341 : auto:     0.6395(G) dot:     1.4223 gus:     0.6375 heap:     0.5677 MATLAB     0.6007 speedup auto:       0.94 dot:       0.42 gus:       0.94 heap:       1.06
47989584 : auto:     1.9451(G) dot:     1.5511 gus:     1.9226 heap:    22.8535 MATLAB     3.9964 speedup auto:       2.05 dot:       2.58 gus:       2.08 heap:       0.17
Elapsed time is 1.097486 seconds.

------------------------------ C = x'*A
      16 : auto:     0.1248(G) dot:     0.1065 gus:     0.1260 heap:     0.0907 MATLAB     0.1177 speedup auto:       0.94 dot:       1.10 gus:       0.93 heap:       1.30
    1600 : auto:     0.3795(G) dot:     2.3987 gus:     0.3710 heap:     0.3236 MATLAB     0.1160 speedup auto:       0.31 dot:       0.05 gus:       0.31 heap:       0.36
   16000 : auto:     0.5659(G) dot:     4.3482 gus:     0.5532 heap:     0.4833 MATLAB     0.0916 speedup auto:       0.16 dot:       0.02 gus:       0.17 heap:       0.19
   79937 : auto:     0.8266(G) dot:     7.1723 gus:     0.8033 heap:     0.6384 MATLAB     0.1170 speedup auto:       0.14 dot:       0.02 gus:       0.15 heap:       0.18
 1138383 : auto:     0.4164(G) dot:    14.5074 gus:     0.4112 heap:     0.3935 MATLAB     0.3909 speedup auto:       0.94 dot:       0.03 gus:       0.95 heap:       0.99
47989584 : auto:     1.4991(d) dot:     1.4963 gus:     3.2452 heap:     4.3386 MATLAB     3.1926 speedup auto:       2.13 dot:       2.13 gus:       0.98 heap:       0.74
Elapsed time is 2.835632 seconds.

------------------------------ C = x'*A'
      16 : auto:     0.3952(G) dot:     0.3253 gus:     0.3531 heap:     0.2801 MATLAB     0.0429 speedup auto:       0.11 dot:       0.13 gus:       0.12 heap:       0.15
    1600 : auto:     0.5947(G) dot:     2.6954 gus:     0.5998 heap:     0.5427 MATLAB     0.0157 speedup auto:       0.03 dot:       0.01 gus:       0.03 heap:       0.03
   15997 : auto:     0.9226(G) dot:     4.5385 gus:     0.7478 heap:     0.6664 MATLAB     0.0328 speedup auto:       0.04 dot:       0.01 gus:       0.04 heap:       0.05
   79924 : auto:     1.1552(G) dot:     7.4058 gus:     1.0072 heap:     0.8574 MATLAB     0.0911 speedup auto:       0.08 dot:       0.01 gus:       0.09 heap:       0.11
 1138241 : auto:     0.7552(G) dot:    14.6657 gus:     0.6039 heap:     0.5904 MATLAB     0.6493 speedup auto:       0.86 dot:       0.04 gus:       1.08 heap:       1.10
47989584 : auto:     1.8347(d) dot:     1.8326 gus:     3.4596 heap:     4.6007 MATLAB     4.1416 speedup auto:       2.26 dot:       2.26 gus:       1.20 heap:       0.90
Elapsed time is 3.077115 seconds.

------------------------------ C = A'*x'
      16 : auto:     0.2520(h) dot:     0.1891 gus:     0.3113 heap:     0.2180 MATLAB     0.1141 speedup auto:       0.45 dot:       0.60 gus:       0.37 heap:       0.52
    1600 : auto:     0.2274(h) dot:     0.8051 gus:     0.2162 heap:     0.2187 MATLAB     0.0876 speedup auto:       0.39 dot:       0.11 gus:       0.41 heap:       0.40
   15994 : auto:     0.3259(h) dot:     0.7885 gus:     0.2320 heap:     0.2241 MATLAB     0.0852 speedup auto:       0.26 dot:       0.11 gus:       0.37 heap:       0.38
   79936 : auto:     0.3785(h) dot:     0.8678 gus:     0.2949 heap:     0.2423 MATLAB     0.1076 speedup auto:       0.28 dot:       0.12 gus:       0.37 heap:       0.44
 1138422 : auto:     0.9714(G) dot:     1.1783 gus:     0.8568 heap:     0.8158 MATLAB     0.3435 speedup auto:       0.35 dot:       0.29 gus:       0.40 heap:       0.42
47989584 : auto:     2.2084(d) dot:     2.1472 gus:     2.9624 heap:    23.6609 MATLAB     2.0834 speedup auto:       0.94 dot:       0.97 gus:       0.70 heap:       0.09
Elapsed time is 1.798996 seconds.

test48: all tests passed

--------------performance test GB_mex_assign
nnzB: 38479
MATLAB start:
Elapsed time is 83.697335 seconds.
GraphBLAS start:
Elapsed time is 0.765709 seconds.
MATLAB start:
Elapsed time is 83.361143 seconds.
GraphBLAS start:
Elapsed time is 0.622931 seconds.

test46b: all tests passed
m   1 n   1 MATLAB:  0.0072751  GrB:  0.0075179  speedup       0.97  err: 0
m   1 n   2 MATLAB:   0.010535  GrB:   0.030622  speedup       0.34  err: 0
m   1 n   3 MATLAB:    0.02826  GrB:   0.018682  speedup       1.51  err: 0
m   1 n   4 MATLAB:   0.033252  GrB:   0.020605  speedup       1.61  err: 0
m   2 n   1 MATLAB:   0.010642  GrB:   0.019903  speedup       0.53  err: 0
m   2 n   2 MATLAB:    0.02712  GrB:    0.02684  speedup       1.01  err: 0
m   2 n   3 MATLAB:    0.03663  GrB:   0.031539  speedup       1.16  err: 0
m   2 n   4 MATLAB:   0.046461  GrB:    0.04306  speedup       1.08  err: 0
m   3 n   1 MATLAB:   0.016584  GrB:    0.02278  speedup       0.73  err: 0
m   3 n   2 MATLAB:   0.074527  GrB:   0.033304  speedup       2.24  err: 0
m   3 n   3 MATLAB:   0.052638  GrB:   0.041001  speedup       1.28  err: 0
m   3 n   4 MATLAB:   0.067988  GrB:   0.045051  speedup       1.51  err: 0
m   4 n   1 MATLAB:   0.026785  GrB:   0.018565  speedup       1.44  err: 0
m   4 n   2 MATLAB:   0.052129  GrB:   0.034374  speedup       1.52  err: 0
m   4 n   3 MATLAB:   0.077515  GrB:   0.042494  speedup       1.82  err: 0
m   4 n   4 MATLAB:    0.09692  GrB:   0.055375  speedup       1.75  err: 0

test49: all tests passed

-----------performance test GB_mex_subassign, multiple ops

----------------------------------------------------- 1

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 10
ni: [ 40 to 941]
nj: [ 1 to 1]
nz: [ 4 to 91]
C is 1000-by-1 nnz(A): 90  nz to add: 393  matrices: 10
GraphBLAS time: 0.00103301
final nnz: 394
start MATLAB...
MATLAB    time: 0.0136533
GraphBLAS speedup: 13.2171

----------------------------------------------------- 2

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 1000
ni: [ 1 to 64]
nj: [ 1 to 64]
nz: [ 1 to 383]
C is 67-by-67 nnz(A): 294  nz to add: 101741  matrices: 1000
GraphBLAS time: 0.0720125
final nnz: 4489
start MATLAB...
MATLAB    time: 0.0796817
GraphBLAS speedup: 1.1065

----------------------------------------------------- 3

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 1000
ni: [ 1 to 64]
nj: [ 1 to 64]
nz: [ 1 to 388]
C is 80-by-80 nnz(A): 2126  nz to add: 101315  matrices: 1000
GraphBLAS time: 0.0205068
final nnz: 6400
start MATLAB...
MATLAB    time: 0.0860834
GraphBLAS speedup: 4.1978

----------------------------------------------------- 4

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 50
ni: [ 19 to 49126]
nj: [ 26 to 496]
nz: [ 0 to 83]
C is 50000-by-500 nnz(A): 25000000  nz to add: 1192  matrices: 50
GraphBLAS time: 0.278895
final nnz: 25000000
start MATLAB...
MATLAB    time: 64.0839
GraphBLAS speedup: 229.778

----------------------------------------------------- 5

Prob = 

  struct with fields:

      name: 'Freescale/Freescale2'
     title: 'circuit simulation matrix from Freescale'
         A: [2999349×2999349 double]
     Zeros: [2999349×2999349 double]
        id: 2662
      date: '2015'
    author: 'K. Gullapalli'
        ed: 'T. Davis'
      kind: 'circuit simulation matrix'
     notes: [459 char]


--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 100
ni: [ 21 to 1011]
nj: [ 7 to 1002]
nz: [ 6 to 1750]
C is 2999349-by-2999349 nnz(A): 14313235  nz to add: 53923  matrices: 100
GraphBLAS time: 0.647364
final nnz: 14367158
start MATLAB...
MATLAB    time: 377.443
GraphBLAS speedup: 583.046

test51: all tests passed

-----------performance test GB_mex_assign, multiple ops

----------------------------------------------------- 1

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 10
ni: [ 40 to 941]
nj: [ 1 to 1]
nz: [ 4 to 91]
C is 1000-by-1 nnz(A): 90  nz to add: 393  matrices: 10
GraphBLAS time: 0.000255714
final nnz: 394
start MATLAB...
MATLAB    time: 0.00409087
GraphBLAS speedup: 15.9978

----------------------------------------------------- 2

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 1000
ni: [ 1 to 64]
nj: [ 1 to 64]
nz: [ 1 to 383]
C is 67-by-67 nnz(A): 294  nz to add: 101741  matrices: 1000
GraphBLAS time: 0.00971653
final nnz: 4489
start MATLAB...
MATLAB    time: 0.0697119
GraphBLAS speedup: 7.17457

----------------------------------------------------- 3

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 1000
ni: [ 1 to 64]
nj: [ 1 to 64]
nz: [ 1 to 388]
C is 80-by-80 nnz(A): 2126  nz to add: 101315  matrices: 1000
GraphBLAS time: 0.0128309
final nnz: 6400
start MATLAB...
MATLAB    time: 0.0633754
GraphBLAS speedup: 4.93927

----------------------------------------------------- 4

--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 50
ni: [ 19 to 49126]
nj: [ 26 to 496]
nz: [ 0 to 83]
C is 50000-by-500 nnz(A): 25000000  nz to add: 1192  matrices: 50
GraphBLAS time: 0.262797
final nnz: 25000000
start MATLAB...
MATLAB    time: 63.8561
GraphBLAS speedup: 242.986

----------------------------------------------------- 5

Prob = 

  struct with fields:

      name: 'Freescale/Freescale2'
     title: 'circuit simulation matrix from Freescale'
         A: [2999349×2999349 double]
     Zeros: [2999349×2999349 double]
        id: 2662
      date: '2015'
    author: 'K. Gullapalli'
        ed: 'T. Davis'
      kind: 'circuit simulation matrix'
     notes: [459 char]


--------------------------
C(I,J) = accum(C(I,J),A) one op, assemble at end
number of C(I,J) = ... to do: 100
ni: [ 21 to 1011]
nj: [ 7 to 1002]
nz: [ 6 to 1750]
C is 2999349-by-2999349 nnz(A): 14313235  nz to add: 53923  matrices: 100
GraphBLAS time: 10.2278
final nnz: 14367158
start MATLAB...
MATLAB    time: 377.54
GraphBLAS speedup: 36.913

test51b: all tests passed

test58: ----- quick performance for GB_mex_eWiseAdd_Matrix
MATLAB: 0.782267 GB: 1.03916  speedup: 0.752789
A+B:   m     10 n     10 nz        3: MATLAB   0.0001 GrB   0.0000 speedup 4.67858
A+B':  m     10 n     10 nz        3: MATLAB   0.0001 GrB   0.0000 speedup 4.20028
A'+B:  m     10 n     10 nz        3: MATLAB   0.0001 GrB   0.0000 speedup 4.07097
A'+B': m     10 n     10 nz        3: MATLAB   0.0001 GrB   0.0000 speedup 4.29587
A+B:   m     10 n    100 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.390469
A+B':  m     10 n    100 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.697451
A'+B:  m     10 n    100 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.376319
A'+B': m     10 n    100 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.466648
A+B:   m     10 n   1000 nz      296: MATLAB   0.0000 GrB   0.0000 speedup 0.419087
A+B':  m     10 n   1000 nz      296: MATLAB   0.0000 GrB   0.0000 speedup 0.353158
A'+B:  m     10 n   1000 nz      296: MATLAB   0.0000 GrB   0.0000 speedup 0.32978
A'+B': m     10 n   1000 nz      296: MATLAB   0.0000 GrB   0.0000 speedup 0.488207
A+B:   m     10 n  10000 nz     2957: MATLAB   0.0001 GrB   0.0002 speedup 0.505175
A+B':  m     10 n  10000 nz     2957: MATLAB   0.0001 GrB   0.0003 speedup 0.367015
A'+B:  m     10 n  10000 nz     2957: MATLAB   0.0001 GrB   0.0003 speedup 0.346489
A'+B': m     10 n  10000 nz     2957: MATLAB   0.0001 GrB   0.0003 speedup 0.575542
A+B:   m     10 n  50000 nz    14800: MATLAB   0.0006 GrB   0.0011 speedup 0.59562
A+B':  m     10 n  50000 nz    14800: MATLAB   0.0007 GrB   0.0016 speedup 0.435895
A'+B:  m     10 n  50000 nz    14800: MATLAB   0.0007 GrB   0.0017 speedup 0.423336
A'+B': m     10 n  50000 nz    14800: MATLAB   0.0008 GrB   0.0013 speedup 0.640121
A+B:   m    100 n     10 nz       30: MATLAB   0.0004 GrB   0.0000 speedup 19.7143
A+B':  m    100 n     10 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.377199
A'+B:  m    100 n     10 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.329497
A'+B': m    100 n     10 nz       30: MATLAB   0.0000 GrB   0.0000 speedup 0.412459
A+B:   m    100 n    100 nz      297: MATLAB   0.0000 GrB   0.0000 speedup 0.436461
A+B':  m    100 n    100 nz      297: MATLAB   0.0000 GrB   0.0000 speedup 0.419843
A'+B:  m    100 n    100 nz      297: MATLAB   0.0000 GrB   0.0000 speedup 0.388043
A'+B': m    100 n    100 nz      297: MATLAB   0.0000 GrB   0.0000 speedup 0.365516
A+B:   m    100 n   1000 nz     2956: MATLAB   0.0000 GrB   0.0001 speedup 0.494705
A+B':  m    100 n   1000 nz     2956: MATLAB   0.0000 GrB   0.0001 speedup 0.34006
A'+B:  m    100 n   1000 nz     2956: MATLAB   0.0000 GrB   0.0001 speedup 0.318347
A'+B': m    100 n   1000 nz     2956: MATLAB   0.0001 GrB   0.0002 speedup 0.324409
A+B:   m    100 n  10000 nz    29543: MATLAB   0.0004 GrB   0.0007 speedup 0.59429
A+B':  m    100 n  10000 nz    29543: MATLAB   0.0005 GrB   0.0014 speedup 0.386265
A'+B:  m    100 n  10000 nz    29543: MATLAB   0.0005 GrB   0.0014 speedup 0.397189
A'+B': m    100 n  10000 nz    29543: MATLAB   0.0006 GrB   0.0023 speedup 0.278917
A+B:   m    100 n  50000 nz   147781: MATLAB   0.0028 GrB   0.0038 speedup 0.734977
A+B':  m    100 n  50000 nz   147781: MATLAB   0.0036 GrB   0.0077 speedup 0.460605
A'+B:  m    100 n  50000 nz   147781: MATLAB   0.0038 GrB   0.0086 speedup 0.445318
A'+B': m    100 n  50000 nz   147781: MATLAB   0.0045 GrB   0.0112 speedup 0.397423
A+B:   m   1000 n     10 nz      289: MATLAB   0.0000 GrB   0.0000 speedup 1.13535
A+B':  m   1000 n     10 nz      289: MATLAB   0.0000 GrB   0.0000 speedup 0.501748
A'+B:  m   1000 n     10 nz      289: MATLAB   0.0000 GrB   0.0000 speedup 0.586243
A'+B': m   1000 n     10 nz      289: MATLAB   0.0000 GrB   0.0000 speedup 0.434422
A+B:   m   1000 n    100 nz     2948: MATLAB   0.0000 GrB   0.0001 speedup 0.440061
A+B':  m   1000 n    100 nz     2948: MATLAB   0.0000 GrB   0.0001 speedup 0.441924
A'+B:  m   1000 n    100 nz     2948: MATLAB   0.0000 GrB   0.0001 speedup 0.436783
A'+B': m   1000 n    100 nz     2948: MATLAB   0.0000 GrB   0.0003 speedup 0.167288
A+B:   m   1000 n   1000 nz    29532: MATLAB   0.0003 GrB   0.0005 speedup 0.537921
A+B':  m   1000 n   1000 nz    29532: MATLAB   0.0003 GrB   0.0007 speedup 0.483278
A'+B:  m   1000 n   1000 nz    29532: MATLAB   0.0003 GrB   0.0006 speedup 0.588169
A'+B': m   1000 n   1000 nz    29532: MATLAB   0.0004 GrB   0.0025 speedup 0.173329
A+B:   m   1000 n  10000 nz   295476: MATLAB   0.0028 GrB   0.0054 speedup 0.519281
A+B':  m   1000 n  10000 nz   295476: MATLAB   0.0046 GrB   0.0070 speedup 0.656167
A'+B:  m   1000 n  10000 nz   295476: MATLAB   0.0044 GrB   0.0065 speedup 0.673582
A'+B': m   1000 n  10000 nz   295476: MATLAB   0.0057 GrB   0.0249 speedup 0.227737
A+B:   m   1000 n  50000 nz  1478087: MATLAB   0.0232 GrB   0.0324 speedup 0.715837
A+B':  m   1000 n  50000 nz  1478087: MATLAB   0.0377 GrB   0.0482 speedup 0.781995
A'+B:  m   1000 n  50000 nz  1478087: MATLAB   0.0402 GrB   0.0469 speedup 0.857493
A'+B': m   1000 n  50000 nz  1478087: MATLAB   0.0541 GrB   0.1417 speedup 0.381714
A+B:   m  10000 n     10 nz     2949: MATLAB   0.0002 GrB   0.0001 speedup 3.65361
A+B':  m  10000 n     10 nz     2949: MATLAB   0.0000 GrB   0.0001 speedup 0.59016
A'+B:  m  10000 n     10 nz     2949: MATLAB   0.0000 GrB   0.0001 speedup 0.553527
A'+B': m  10000 n     10 nz     2949: MATLAB   0.0001 GrB   0.0003 speedup 0.245552
A+B:   m  10000 n    100 nz    29552: MATLAB   0.0003 GrB   0.0007 speedup 0.478029
A+B':  m  10000 n    100 nz    29552: MATLAB   0.0006 GrB   0.0007 speedup 0.931402
A'+B:  m  10000 n    100 nz    29552: MATLAB   0.0003 GrB   0.0007 speedup 0.464285
A'+B': m  10000 n    100 nz    29552: MATLAB   0.0012 GrB   0.0030 speedup 0.405361
A+B:   m  10000 n   1000 nz   295430: MATLAB   0.0027 GrB   0.0055 speedup 0.492575
A+B':  m  10000 n   1000 nz   295430: MATLAB   0.0047 GrB   0.0059 speedup 0.791662
A'+B:  m  10000 n   1000 nz   295430: MATLAB   0.0042 GrB   0.0061 speedup 0.683893
A'+B': m  10000 n   1000 nz   295430: MATLAB   0.0056 GrB   0.0264 speedup 0.210662
A+B:   m  10000 n  10000 nz  2955594: MATLAB   0.0487 GrB   0.0882 speedup 0.551435
A+B':  m  10000 n  10000 nz  2955594: MATLAB   0.1114 GrB   0.1025 speedup 1.0868
A'+B:  m  10000 n  10000 nz  2955594: MATLAB   0.0853 GrB   0.0914 speedup 0.932738
A'+B': m  10000 n  10000 nz  2955594: MATLAB   0.1116 GrB   0.3199 speedup 0.34881
A+B:   m  10000 n  50000 nz 14776867: MATLAB   0.2400 GrB   0.4230 speedup 0.567464
A+B':  m  10000 n  50000 nz 14776867: MATLAB   0.5615 GrB   0.5625 speedup 0.998286
A'+B:  m  10000 n  50000 nz 14776867: MATLAB   0.4924 GrB   0.5436 speedup 0.90593
A'+B': m  10000 n  50000 nz 14776867: MATLAB   0.6842 GrB   1.7247 speedup 0.396702
A+B:   m  50000 n     10 nz    14789: MATLAB   0.0017 GrB   0.0002 speedup 7.00452
A+B':  m  50000 n     10 nz    14789: MATLAB   0.0002 GrB   0.0004 speedup 0.572773
A'+B:  m  50000 n     10 nz    14789: MATLAB   0.0002 GrB   0.0003 speedup 0.622472
A'+B': m  50000 n     10 nz    14789: MATLAB   0.0008 GrB   0.0021 speedup 0.391155
A+B:   m  50000 n    100 nz   147801: MATLAB   0.0018 GrB   0.0029 speedup 0.608358
A+B':  m  50000 n    100 nz   147801: MATLAB   0.0025 GrB   0.0034 speedup 0.736992
A'+B:  m  50000 n    100 nz   147801: MATLAB   0.0023 GrB   0.0034 speedup 0.678671
A'+B': m  50000 n    100 nz   147801: MATLAB   0.0029 GrB   0.0141 speedup 0.207875
A+B:   m  50000 n   1000 nz  1477729: MATLAB   0.0194 GrB   0.0322 speedup 0.601367
A+B':  m  50000 n   1000 nz  1477729: MATLAB   0.0295 GrB   0.0394 speedup 0.749213
A'+B:  m  50000 n   1000 nz  1477729: MATLAB   0.0306 GrB   0.0381 speedup 0.802619
A'+B': m  50000 n   1000 nz  1477729: MATLAB   0.0370 GrB   0.1610 speedup 0.229571
A+B:   m  50000 n  10000 nz 14777595: MATLAB   0.2360 GrB   0.4242 speedup 0.55639
A+B':  m  50000 n  10000 nz 14777595: MATLAB   0.5057 GrB   0.5161 speedup 0.979841
A'+B:  m  50000 n  10000 nz 14777595: MATLAB   0.4631 GrB   0.5051 speedup 0.916754
A'+B': m  50000 n  10000 nz 14777595: MATLAB   0.5997 GrB   1.8115 speedup 0.331073
A+B:   m  50000 n  50000 nz 73886295: MATLAB   1.8572 GrB   3.7697 speedup 0.49265
A+B':  m  50000 n  50000 nz 73886295: MATLAB   4.9557 GrB   4.3604 speedup 1.13653
A'+B:  m  50000 n  50000 nz 73886295: MATLAB   4.6604 GrB   4.4534 speedup 1.0465
A'+B': m  50000 n  50000 nz 73886295: MATLAB   5.2006 GrB  10.7067 speedup 0.485733

test58: all tests passed

----------------------------- eMult performance tests

Prob = 

  struct with fields:

      name: 'Freescale/Freescale2'
     title: 'circuit simulation matrix from Freescale'
         A: [2999349×2999349 double]
     Zeros: [2999349×2999349 double]
        id: 2662
      date: '2015'
    author: 'K. Gullapalli'
        ed: 'T. Davis'
      kind: 'circuit simulation matrix'
     notes: [459 char]



m: 2999349 n 2999349 nnz(A) 29294197
d 3.25632e-06 nnz(C) 14313235 MATLAB   0.246970 GB   0.235818  speedup     1.0473


m: 5000 n 5000 nnz(A) 25000000
d      1e-05 nnz(C)      250 MATLAB   0.043841 GB   0.032638  speedup     1.3433
d      2e-05 nnz(C)      500 MATLAB   0.024088 GB   0.000691  speedup    34.8667
d      3e-05 nnz(C)      750 MATLAB   0.037153 GB   0.000803  speedup    46.2610
d      4e-05 nnz(C)     1000 MATLAB   0.028674 GB   0.001990  speedup    14.4074
d      5e-05 nnz(C)     1250 MATLAB   0.028957 GB   0.001644  speedup    17.6183
d      6e-05 nnz(C)     1500 MATLAB   0.031681 GB   0.001364  speedup    23.2330
d      7e-05 nnz(C)     1750 MATLAB   0.033389 GB   0.001570  speedup    21.2606
d      8e-05 nnz(C)     2000 MATLAB   0.046668 GB   0.004562  speedup    10.2304
d      9e-05 nnz(C)     2249 MATLAB   0.027869 GB   0.001991  speedup    13.9987
d     0.0001 nnz(C)     2500 MATLAB   0.040436 GB   0.002128  speedup    19.0010
d     0.0002 nnz(C)     4999 MATLAB   0.043894 GB   0.004955  speedup     8.8577
d     0.0003 nnz(C)     7497 MATLAB   0.046734 GB   0.007719  speedup     6.0544
d     0.0004 nnz(C)     9998 MATLAB   0.038687 GB   0.019711  speedup     1.9627
d     0.0005 nnz(C)    12497 MATLAB   0.053280 GB   0.011573  speedup     4.6040
d     0.0006 nnz(C)    14994 MATLAB   0.051925 GB   0.013583  speedup     3.8229
d     0.0007 nnz(C)    17495 MATLAB   0.066758 GB   0.014664  speedup     4.5526
d     0.0008 nnz(C)    19990 MATLAB   0.058632 GB   0.013401  speedup     4.3752
d     0.0009 nnz(C)    22486 MATLAB   0.054637 GB   0.020096  speedup     2.7188
d      0.001 nnz(C)    24985 MATLAB   0.054822 GB   0.015765  speedup     3.4774
d      0.002 nnz(C)    49965 MATLAB   0.069182 GB   0.029826  speedup     2.3195
d      0.003 nnz(C)    74897 MATLAB   0.071423 GB   0.036522  speedup     1.9556
d      0.004 nnz(C)    99815 MATLAB   0.060769 GB   0.033708  speedup     1.8028
d      0.005 nnz(C)   124671 MATLAB   0.066411 GB   0.034345  speedup     1.9336
d      0.006 nnz(C)   149547 MATLAB   0.053656 GB   0.026914  speedup     1.9936
d      0.007 nnz(C)   174388 MATLAB   0.056994 GB   0.029771  speedup     1.9144
d      0.008 nnz(C)   199230 MATLAB   0.061960 GB   0.029653  speedup     2.0895
d      0.009 nnz(C)   223996 MATLAB   0.062140 GB   0.034482  speedup     1.8021
d       0.01 nnz(C)   248768 MATLAB   0.062828 GB   0.037443  speedup     1.6780
d       0.02 nnz(C)   495144 MATLAB   0.069711 GB   0.046180  speedup     1.5095
d       0.03 nnz(C)   738787 MATLAB   0.072271 GB   0.049782  speedup     1.4518
d       0.04 nnz(C)   980341 MATLAB   0.078037 GB   0.053157  speedup     1.4681
d       0.05 nnz(C)  1219398 MATLAB   0.091523 GB   0.055499  speedup     1.6491
d       0.06 nnz(C)  1455882 MATLAB   0.096724 GB   0.056914  speedup     1.6995
d       0.07 nnz(C)  1690678 MATLAB   0.097676 GB   0.065110  speedup     1.5002
d       0.08 nnz(C)  1922327 MATLAB   0.106658 GB   0.059462  speedup     1.7937
d       0.09 nnz(C)  2151287 MATLAB   0.105276 GB   0.066640  speedup     1.5798
d        0.1 nnz(C)  2379472 MATLAB   0.113040 GB   0.073890  speedup     1.5298
d          1 nnz(C) 15804925 MATLAB   0.311666 GB   0.222164  speedup     1.4029

test61: all tests passed

---------------------------- quick test of GrB_apply
MATLAB, full: 0.0235
MATLAB 0.1115  GB 0.1201 speedup 0.927899
MATLAB, full: 0.0374
MATLAB 0.0744  GB 0.1444 speedup 0.515252
MATLAB 0.0392  GB 0.0334 speedup 1.17502
MATLAB 0.0397  GB 0.0071 speedup 5.58533
MATLAB 0.0391  GB 0.0040 speedup 9.8344
d    0.000 MATLAB 0.0372  GB 0.0002 speedup 187.98
d    0.002 MATLAB 0.0537  GB 0.0098 speedup 5.45739
d    0.004 MATLAB 0.0506  GB 0.0159 speedup 3.18835
d    0.006 MATLAB 0.0474  GB 0.0134 speedup 3.53161
d    0.008 MATLAB 0.0506  GB 0.0151 speedup 3.34343
d    0.010 MATLAB 0.0433  GB 0.0155 speedup 2.79431
d    0.012 MATLAB 0.0407  GB 0.0203 speedup 2.00319
d    0.014 MATLAB 0.0236  GB 0.0260 speedup 0.906564
d    0.016 MATLAB 0.0310  GB 0.0202 speedup 1.53339
d    0.018 MATLAB 0.0304  GB 0.0216 speedup 1.40671
d    0.020 MATLAB 0.0265  GB 0.0206 speedup 1.28545
d    0.022 MATLAB 0.0260  GB 0.0248 speedup 1.04654
d    0.024 MATLAB 0.0282  GB 0.0219 speedup 1.28587
d    0.026 MATLAB 0.0322  GB 0.0262 speedup 1.23214
d    0.028 MATLAB 0.0312  GB 0.0227 speedup 1.37427
d    0.030 MATLAB 0.0292  GB 0.0251 speedup 1.16612
d    0.032 MATLAB 0.0292  GB 0.0261 speedup 1.11794
d    0.034 MATLAB 0.0292  GB 0.0238 speedup 1.22601
d    0.036 MATLAB 0.0309  GB 0.0252 speedup 1.22736
d    0.038 MATLAB 0.0306  GB 0.0268 speedup 1.14303
d    0.040 MATLAB 0.0311  GB 0.0265 speedup 1.17067
d    0.042 MATLAB 0.0317  GB 0.0255 speedup 1.24566
d    0.044 MATLAB 0.0331  GB 0.0271 speedup 1.22016
d    0.046 MATLAB 0.0352  GB 0.0276 speedup 1.27608
d    0.048 MATLAB 0.0329  GB 0.0279 speedup 1.17886
d    0.050 MATLAB 0.0351  GB 0.0256 speedup 1.36946
d    0.052 MATLAB 0.0335  GB 0.0265 speedup 1.26376
d    0.054 MATLAB 0.0356  GB 0.0287 speedup 1.23965
d    0.056 MATLAB 0.0338  GB 0.0270 speedup 1.25094
d    0.058 MATLAB 0.0342  GB 0.0257 speedup 1.32893
d    0.060 MATLAB 0.0382  GB 0.0261 speedup 1.45976
d    0.062 MATLAB 0.0388  GB 0.0278 speedup 1.39686
d    0.064 MATLAB 0.0398  GB 0.0257 speedup 1.54847
d    0.066 MATLAB 0.0403  GB 0.0292 speedup 1.37962
d    0.068 MATLAB 0.0364  GB 0.0277 speedup 1.31678
d    0.070 MATLAB 0.0366  GB 0.0296 speedup 1.23532
d    0.072 MATLAB 0.0365  GB 0.0261 speedup 1.39551
d    0.074 MATLAB 0.0366  GB 0.0249 speedup 1.47128
d    0.076 MATLAB 0.0410  GB 0.0285 speedup 1.438
d    0.078 MATLAB 0.0368  GB 0.0277 speedup 1.32824
d    0.080 MATLAB 0.0413  GB 0.0323 speedup 1.27823
d    0.082 MATLAB 0.0402  GB 0.0292 speedup 1.37774
d    0.084 MATLAB 0.0421  GB 0.0261 speedup 1.61507
d    0.086 MATLAB 0.0412  GB 0.0263 speedup 1.56574
d    0.088 MATLAB 0.0400  GB 0.0282 speedup 1.41653
d    0.090 MATLAB 0.0398  GB 0.0281 speedup 1.41465
d    0.092 MATLAB 0.0394  GB 0.0274 speedup 1.43904
d    0.094 MATLAB 0.0398  GB 0.0281 speedup 1.41619
d    0.096 MATLAB 0.0400  GB 0.0318 speedup 1.25977
d    0.098 MATLAB 0.0430  GB 0.0330 speedup 1.3037
d    0.100 MATLAB 0.0447  GB 0.0280 speedup 1.59632

id  936 Matrix ND/nd3k
n 9000 edges 1635345
# triangles: 104073280
MATLAB:        Sandia     0.7968
GraphBLAS:     Sandia     0.5461
GraphBLAS:    SandiaL     0.5257
GraphBLAS:  SandiaDot     1.5846

id 2662 Matrix Freescale/Freescale2
n 2999349 edges 5744934
# triangles: 21027280
MATLAB:        Sandia     0.6631
GraphBLAS:     Sandia     0.7233
GraphBLAS:    SandiaL     0.2589
GraphBLAS:  SandiaDot     0.5201

id  936 Matrix ND/nd3k
n 9000 edges 1635345
triangles: 104073280
GraphBLAS outer product:       0.518999 sec (rate   3.15 million/sec)
GraphBLAS dot   product:       0.660700 sec (rate   2.48 million/sec)
nnz(L*L) 7.36943e+06 flops 4.70914e+08 memory 0.118127 (GB)
MATLAB (U*U).*U:               0.821741 sec (rate   1.99 million/sec)

id 2662 Matrix Freescale/Freescale2
n 2999349 edges 5744934
triangles: 21027280
GraphBLAS outer product:       0.410986 sec (rate  13.98 million/sec)
GraphBLAS dot   product:       0.340092 sec (rate  16.89 million/sec)
nnz(L*L) 1.56382e+07 flops 6.12483e+07 memory 0.322195 (GB)
MATLAB (U*U).*U:               0.653410 sec (rate   8.79 million/sec)

----------------- C=A*B performance

Prob = 

  struct with fields:

      name: 'VanVelzen/Zd_Jac2'
     title: 'Chemical process simulation, Nils van Velzen (complete problem)'
         A: [22835×22835 double]
         b: [22835×1 double]
     Zeros: [22835×22835 double]
        id: 1338
      date: '2006'
    author: 'N. van Velzen'
        ed: 'T. Davis'
      kind: 'chemical process simulation problem'

mxm, no mask 0.794953
mxm, no mask 0.796599
MATLAB, no mask 0.809977
MATLAB, mask 0.9314
mxm, with mask 0.140074
mxm, then emult 0.841126
mxm, with mask C 0.349855
mxm, with mask L 1.18589 (dot)

Prob = 

  struct with fields:

      name: 'Freescale/Freescale2'
     title: 'circuit simulation matrix from Freescale'
         A: [2999349×2999349 double]
     Zeros: [2999349×2999349 double]
        id: 2662
      date: '2015'
    author: 'K. Gullapalli'
        ed: 'T. Davis'
      kind: 'circuit simulation matrix'
     notes: [459 char]

C = A (1:      1:n, 1:      1:n) MATLAB     0.173291 GraphBLAS      0.20805 speedup   0.832933
C = A (1:      2:n, 1:      2:n) MATLAB      0.15479 GraphBLAS     0.215089 speedup   0.719653
C = A (1:      3:n, 1:      3:n) MATLAB    0.0981818 GraphBLAS     0.142037 speedup    0.69124
C = A (1:      4:n, 1:      4:n) MATLAB     0.084843 GraphBLAS     0.107193 speedup   0.791498
C = A (1:      5:n, 1:      5:n) MATLAB    0.0696544 GraphBLAS    0.0935828 speedup   0.744308
C = A (1:      6:n, 1:      6:n) MATLAB    0.0642494 GraphBLAS    0.0688869 speedup    0.93268
C = A (1:      7:n, 1:      7:n) MATLAB    0.0616821 GraphBLAS     0.072195 speedup   0.854382
C = A (1:      8:n, 1:      8:n) MATLAB    0.0693942 GraphBLAS    0.0584242 speedup    1.18776
C = A (1:      9:n, 1:      9:n) MATLAB    0.0693866 GraphBLAS    0.0629076 speedup    1.10299
C = A (1:     10:n, 1:     10:n) MATLAB    0.0522629 GraphBLAS    0.0586421 speedup   0.891219
C = A (1:     16:n, 1:     16:n) MATLAB    0.0320761 GraphBLAS    0.0454362 speedup   0.705959
C = A (1:     64:n, 1:     64:n) MATLAB    0.0262321 GraphBLAS     0.020932 speedup    1.25321
C = A (1:    128:n, 1:    128:n) MATLAB     0.018789 GraphBLAS   0.00597443 speedup     3.1449
C = A (1:    256:n, 1:    256:n) MATLAB    0.0238975 GraphBLAS   0.00353383 speedup    6.76248
C = A (1:   1024:n, 1:   1024:n) MATLAB     0.010502 GraphBLAS   0.00109782 speedup    9.56628
C = A (1: 100000:n, 1: 100000:n) MATLAB   0.00776344 GraphBLAS   4.9516e-05 speedup    156.786
C = A (1:1000000:n, 1:1000000:n) MATLAB   0.00768535 GraphBLAS   1.9831e-05 speedup    387.542
C = A (1:2000000:n, 1:2000000:n) MATLAB   0.00659694 GraphBLAS   2.4007e-05 speedup    274.792

C = A (1:      1, 1:      1) MATLAB    9.521e-05 GraphBLAS   1.7716e-05 speedup    5.37424
C = A (1:      2, 1:      2) MATLAB    0.0119068 GraphBLAS   2.2044e-05 speedup    540.138
C = A (1:      3, 1:      3) MATLAB   0.00770145 GraphBLAS   2.1357e-05 speedup    360.605
C = A (1:      4, 1:      4) MATLAB   0.00698876 GraphBLAS   2.8937e-05 speedup    241.517
C = A (1:      5, 1:      5) MATLAB    0.0133534 GraphBLAS  0.000107135 speedup    124.641
C = A (1:      6, 1:      6) MATLAB   0.00962427 GraphBLAS   2.0006e-05 speedup    481.069
C = A (1:      7, 1:      7) MATLAB   0.00823675 GraphBLAS    2.785e-05 speedup    295.754
C = A (1:      8, 1:      8) MATLAB   0.00975099 GraphBLAS    2.506e-05 speedup    389.106
C = A (1:      9, 1:      9) MATLAB   0.00903443 GraphBLAS   1.9615e-05 speedup    460.588
C = A (1:     10, 1:     10) MATLAB   0.00691993 GraphBLAS   1.6825e-05 speedup    411.289
C = A (1:     16, 1:     16) MATLAB   0.00752616 GraphBLAS   0.00010468 speedup    71.8968
C = A (1:     64, 1:     64) MATLAB   0.00864829 GraphBLAS   2.3372e-05 speedup    370.028
C = A (1:    128, 1:    128) MATLAB     0.007058 GraphBLAS   6.3021e-05 speedup    111.994
C = A (1:    256, 1:    256) MATLAB    0.0108712 GraphBLAS   5.3198e-05 speedup    204.353
C = A (1:   1024, 1:   1024) MATLAB    0.0092525 GraphBLAS    6.849e-05 speedup    135.093
C = A (1: 100000, 1: 100000) MATLAB    0.0138823 GraphBLAS    0.0234664 speedup   0.591583
C = A (1:1000000, 1:1000000) MATLAB      0.11963 GraphBLAS    0.0776153 speedup    1.54132
C = A (1:2000000, 1:2000000) MATLAB     0.188857 GraphBLAS     0.147913 speedup    1.27681

C = A (      1:  10001,       1:  10001) MATLAB    0.0243757 GraphBLAS  0.000831051 speedup    29.3312
C = A (      2:  10002,       2:  10002) MATLAB     0.010079 GraphBLAS  0.000712193 speedup     14.152
C = A (      3:  10003,       3:  10003) MATLAB   0.00828761 GraphBLAS  0.000457345 speedup    18.1211
C = A (      4:  10004,       4:  10004) MATLAB    0.0223314 GraphBLAS  0.000418988 speedup    53.2985
C = A (      5:  10005,       5:  10005) MATLAB    0.0111447 GraphBLAS  0.000447492 speedup    24.9048
C = A (      6:  10006,       6:  10006) MATLAB   0.00811767 GraphBLAS   0.00038917 speedup    20.8589
C = A (      7:  10007,       7:  10007) MATLAB    0.0176836 GraphBLAS  0.000561987 speedup    31.4661
C = A (      8:  10008,       8:  10008) MATLAB    0.0105118 GraphBLAS  0.000989427 speedup    10.6241
C = A (      9:  10009,       9:  10009) MATLAB    0.0122058 GraphBLAS  0.000418954 speedup     29.134
C = A (     10:  10010,      10:  10010) MATLAB    0.0177553 GraphBLAS  0.000391101 speedup    45.3984
C = A (     16:  10016,      16:  10016) MATLAB    0.0110992 GraphBLAS  0.000430581 speedup    25.7773
C = A (     64:  10064,      64:  10064) MATLAB   0.00888098 GraphBLAS   0.00103049 speedup    8.61818
C = A (    128:  10128,     128:  10128) MATLAB   0.00994514 GraphBLAS  0.000529411 speedup    18.7853
C = A (    256:  10256,     256:  10256) MATLAB    0.0100965 GraphBLAS   0.00042384 speedup    23.8215
C = A (   1024:  11024,    1024:  11024) MATLAB   0.00736143 GraphBLAS  0.000377053 speedup    19.5236
C = A ( 100000: 110000,  100000: 110000) MATLAB    0.0218229 GraphBLAS  0.000551272 speedup    39.5864
C = A (1000000:1010000, 1000000:1010000) MATLAB   0.00950458 GraphBLAS  0.000422747 speedup    22.4829
C = A (2000000:2010000, 2000000:2010000) MATLAB   0.00782876 GraphBLAS  0.000452391 speedup    17.3053

C = A (  10001:-1:      1,   10001:-1:      1) MATLAB    0.0287001 GraphBLAS   0.00115705 speedup    24.8046
C = A (  10002:-1:      2,   10002:-1:      2) MATLAB    0.0122546 GraphBLAS  0.000822836 speedup    14.8931
C = A (  10003:-1:      3,   10003:-1:      3) MATLAB   0.00909086 GraphBLAS    0.0120126 speedup   0.756778
C = A (  10004:-1:      4,   10004:-1:      4) MATLAB    0.0120608 GraphBLAS  0.000829357 speedup    14.5423
C = A (  10005:-1:      5,   10005:-1:      5) MATLAB    0.0327573 GraphBLAS   0.00137469 speedup    23.8289
C = A (  10006:-1:      6,   10006:-1:      6) MATLAB   0.00883085 GraphBLAS   0.00103942 speedup    8.49593
C = A (  10007:-1:      7,   10007:-1:      7) MATLAB     0.014831 GraphBLAS   0.00648126 speedup    2.28829
C = A (  10008:-1:      8,   10008:-1:      8) MATLAB    0.0127387 GraphBLAS   0.00219263 speedup     5.8098
C = A (  10009:-1:      9,   10009:-1:      9) MATLAB   0.00766316 GraphBLAS   0.00073869 speedup     10.374
C = A (  10010:-1:     10,   10010:-1:     10) MATLAB    0.0101912 GraphBLAS  0.000830426 speedup    12.2722
C = A (  10016:-1:     16,   10016:-1:     16) MATLAB   0.00977392 GraphBLAS  0.000761844 speedup    12.8293
C = A (  10064:-1:     64,   10064:-1:     64) MATLAB   0.00794803 GraphBLAS   0.00072859 speedup    10.9088
C = A (  10128:-1:    128,   10128:-1:    128) MATLAB    0.0102594 GraphBLAS   0.00245542 speedup    4.17827
C = A (  10256:-1:    256,   10256:-1:    256) MATLAB    0.0134829 GraphBLAS  0.000977794 speedup    13.7891
C = A (  11024:-1:   1024,   11024:-1:   1024) MATLAB    0.0095101 GraphBLAS   0.00131241 speedup    7.24629
C = A ( 110000:-1: 100000,  110000:-1: 100000) MATLAB    0.0109174 GraphBLAS  0.000648092 speedup    16.8454
C = A (1010000:-1:1000000, 1010000:-1:1000000) MATLAB   0.00766105 GraphBLAS  0.000915669 speedup    8.36662
C = A (2010000:-1:2000000, 2010000:-1:2000000) MATLAB   0.00978818 GraphBLAS  0.000469915 speedup    20.8297

C = A (n:     -1:1, n:     -1:1) MATLAB     0.219123 GraphBLAS     0.294869 speedup   0.743121
C = A (n:     -2:1, n:     -2:1) MATLAB     0.169658 GraphBLAS     0.218142 speedup   0.777739
C = A (n:     -3:1, n:     -3:1) MATLAB     0.105259 GraphBLAS      0.14795 speedup   0.711448
C = A (n:     -4:1, n:     -4:1) MATLAB    0.0742496 GraphBLAS     0.121394 speedup   0.611643
C = A (n:     -5:1, n:     -5:1) MATLAB    0.0724655 GraphBLAS     0.101368 speedup   0.714877
C = A (n:     -6:1, n:     -6:1) MATLAB    0.0666508 GraphBLAS    0.0749056 speedup   0.889798
C = A (n:     -7:1, n:     -7:1) MATLAB    0.0661382 GraphBLAS    0.0762258 speedup   0.867662
C = A (n:     -8:1, n:     -8:1) MATLAB     0.068863 GraphBLAS    0.0715284 speedup   0.962736
C = A (n:     -9:1, n:     -9:1) MATLAB    0.0662245 GraphBLAS    0.0630645 speedup    1.05011
C = A (n:    -10:1, n:    -10:1) MATLAB    0.0532289 GraphBLAS    0.0567257 speedup   0.938355
C = A (n:    -16:1, n:    -16:1) MATLAB    0.0374859 GraphBLAS    0.0461038 speedup   0.813077
C = A (n:    -64:1, n:    -64:1) MATLAB    0.0202315 GraphBLAS    0.0112999 speedup    1.79042
C = A (n:   -128:1, n:   -128:1) MATLAB    0.0198492 GraphBLAS   0.00711748 speedup     2.7888
C = A (n:   -256:1, n:   -256:1) MATLAB     0.035261 GraphBLAS   0.00612462 speedup    5.75726
C = A (n:  -1024:1, n:  -1024:1) MATLAB    0.0124853 GraphBLAS   0.00119042 speedup    10.4881
C = A (n:-100000:1, n:-100000:1) MATLAB   0.00769285 GraphBLAS  0.000193459 speedup    39.7647
C = A (n:-1000000:1, n:-1000000:1) MATLAB    0.0118731 GraphBLAS   2.4864e-05 speedup    477.523
C = A (n:-2000000:1, n:-2000000:1) MATLAB   0.00813452 GraphBLAS   3.3998e-05 speedup    239.265

----------------------- AdotB versus AxB

ans =

    'AdotB'


C =

   (1,1)       4.3950


ans =

    'did AdotB'


ans =

   All zero sparse: 1×1


C =

   (1,1)      -1.9234


building random sparse matrices 10000000 by M

m   1 n   1   9.99e-08 MATLAB:     0.0002 AdotB :     0.0003 GB,auto::     0.0021(d) outer     0.0940 rel:     0.0027  speedup:     0.0937
m   1 n  10   9.99e-07 MATLAB:     0.0011 AdotB :     0.0011 GB,auto::     0.0009(d) outer     0.0695 rel:     0.0155  speedup:     1.1595
m   1 n  20   2.00e-06 MATLAB:     0.0022 AdotB :     0.0019 GB,auto::     0.0019(d) outer     0.0703 rel:     0.0275  speedup:     1.1649
m   1 n  30   3.00e-06 MATLAB:     0.0034 AdotB :     0.0030 GB,auto::     0.0032(d) outer     0.0726 rel:     0.0409  speedup:     1.0425
m   1 n  40   4.00e-06 MATLAB:     0.0043 AdotB :     0.0042 GB,auto::     0.0040(d) outer     0.0770 rel:     0.0547  speedup:     1.0603
m   1 n  50   5.00e-06 MATLAB:     0.0050 AdotB :     0.0047 GB,auto::     0.0046(d) outer     0.0724 rel:     0.0650  speedup:     1.0895
m   1 n  60   5.99e-06 MATLAB:     0.0061 AdotB :     0.0059 GB,auto::     0.0056(d) outer     0.0740 rel:     0.0792  speedup:     1.0948
m   1 n  61   6.09e-06 MATLAB:     0.0062 AdotB :     0.0061 GB,auto::     0.0059(d) outer     0.0746 rel:     0.0812  speedup:     1.0498
m   1 n  62   6.19e-06 MATLAB:     0.0068 AdotB :     0.0059 GB,auto::     0.0059(d) outer     0.0762 rel:     0.0770  speedup:     1.1536
m   1 n  63   6.29e-06 MATLAB:     0.0065 AdotB :     0.0059 GB,auto::     0.0059(d) outer     0.0783 rel:     0.0757  speedup:     1.1044
m   1 n  64   6.39e-06 MATLAB:     0.0669 AdotB :     0.0067 GB,auto::     0.0066(d) outer     0.0562 rel:     0.1185  speedup:    10.1044
m   1 n  65   6.49e-06 MATLAB:     0.0656 AdotB :     0.0062 GB,auto::     0.0080(d) outer     0.0530 rel:     0.1178  speedup:     8.1798
m   1 n  70   6.99e-06 MATLAB:     0.0671 AdotB :     0.0069 GB,auto::     0.0090(d) outer     0.0520 rel:     0.1322  speedup:     7.4696
m   1 n  80   7.99e-06 MATLAB:     0.0691 AdotB :     0.0076 GB,auto::     0.0081(d) outer     0.0534 rel:     0.1430  speedup:     8.4964
m   1 n  90   8.99e-06 MATLAB:     0.0672 AdotB :     0.0111 GB,auto::     0.0091(d) outer     0.0532 rel:     0.2092  speedup:     7.3602
m   1 n 100   9.99e-06 MATLAB:     0.0707 AdotB :     0.0134 GB,auto::     0.0094(d) outer     0.0569 rel:     0.2351  speedup:     7.4805

m  10 n   1   9.99e-07 MATLAB:     0.0013 AdotB :     0.0013 GB,auto::     0.0009(d) outer     0.0816 rel:     0.0156  speedup:     1.4547
m  10 n  10   9.90e-06 MATLAB:     0.0160 AdotB :     0.0101 GB,auto::     0.0124(G) outer     0.0554 rel:     0.1828  speedup:     1.2851
m  10 n  20   1.98e-05 MATLAB:     0.0207 AdotB :     0.0191 GB,auto::     0.0173(G) outer     0.0777 rel:     0.2465  speedup:     1.1990
m  10 n  30   2.97e-05 MATLAB:     0.0309 AdotB :     0.0275 GB,auto::     0.0264(G) outer     0.0862 rel:     0.3187  speedup:     1.1691
m  10 n  40   3.96e-05 MATLAB:     0.0424 AdotB :     0.0381 GB,auto::     0.0279(G) outer     0.0834 rel:     0.4574  speedup:     1.5224
m  10 n  50   4.95e-05 MATLAB:     0.0500 AdotB :     0.0460 GB,auto::     0.0324(G) outer     0.0886 rel:     0.5196  speedup:     1.5456
m  10 n  60   5.94e-05 MATLAB:     0.0591 AdotB :     0.0583 GB,auto::     0.0395(G) outer     0.0920 rel:     0.6330  speedup:     1.4965
m  10 n  61   6.04e-05 MATLAB:     0.0675 AdotB :     0.0606 GB,auto::     0.0389(G) outer     0.0903 rel:     0.6718  speedup:     1.7349
m  10 n  62   6.14e-05 MATLAB:     0.0635 AdotB :     0.0570 GB,auto::     0.0396(G) outer     0.0894 rel:     0.6379  speedup:     1.6064
m  10 n  63   6.24e-05 MATLAB:     0.0622 AdotB :     0.0612 GB,auto::     0.0418(G) outer     0.0875 rel:     0.6998  speedup:     1.4882
m  10 n  64   6.34e-05 MATLAB:     0.0812 AdotB :     0.0617 GB,auto::     0.0401(G) outer     0.0680 rel:     0.9073  speedup:     2.0235
m  10 n  65   6.44e-05 MATLAB:     0.0793 AdotB :     0.0632 GB,auto::     0.0408(G) outer     0.0627 rel:     1.0089  speedup:     1.9450
m  10 n  70   6.93e-05 MATLAB:     0.0764 AdotB :     0.0671 GB,auto::     0.0437(G) outer     0.0626 rel:     1.0711  speedup:     1.7465
m  10 n  80   7.92e-05 MATLAB:     0.0858 AdotB :     0.0754 GB,auto::     0.0512(G) outer     0.0674 rel:     1.1190  speedup:     1.6763
m  10 n  90   8.91e-05 MATLAB:     0.0964 AdotB :     0.0856 GB,auto::     0.0562(G) outer     0.0716 rel:     1.1968  speedup:     1.7149
m  10 n 100   9.90e-05 MATLAB:     0.0843 AdotB :     0.0936 GB,auto::     0.0657(G) outer     0.0705 rel:     1.3286  speedup:     1.2829

m  20 n   1   2.00e-06 MATLAB:     0.0023 AdotB :     0.0022 GB,auto::     0.0018(d) outer     0.0904 rel:     0.0238  speedup:     1.2370
m  20 n  10   1.98e-05 MATLAB:     0.0204 AdotB :     0.0194 GB,auto::     0.0193(G) outer     0.0901 rel:     0.2150  speedup:     1.0609
m  20 n  20   3.92e-05 MATLAB:     0.0396 AdotB :     0.0371 GB,auto::     0.0269(G) outer     0.0698 rel:     0.5316  speedup:     1.4717
m  20 n  30   5.88e-05 MATLAB:     0.0604 AdotB :     0.0583 GB,auto::     0.0336(G) outer     0.0947 rel:     0.6152  speedup:     1.7967
m  20 n  40   7.84e-05 MATLAB:     0.0787 AdotB :     0.0754 GB,auto::     0.0472(G) outer     0.1047 rel:     0.7196  speedup:     1.6693
m  20 n  50   9.80e-05 MATLAB:     0.1000 AdotB :     0.0916 GB,auto::     0.0480(G) outer     0.0962 rel:     0.9527  speedup:     2.0818
m  20 n  60   1.18e-04 MATLAB:     0.1235 AdotB :     0.1100 GB,auto::     0.0563(G) outer     0.1023 rel:     1.0756  speedup:     2.1929
m  20 n  61   1.20e-04 MATLAB:     0.1285 AdotB :     0.1142 GB,auto::     0.0578(G) outer     0.0989 rel:     1.1553  speedup:     2.2231
m  20 n  62   1.22e-04 MATLAB:     0.1249 AdotB :     0.1176 GB,auto::     0.0590(G) outer     0.0987 rel:     1.1916  speedup:     2.1169
m  20 n  63   1.24e-04 MATLAB:     0.1271 AdotB :     0.1211 GB,auto::     0.0623(G) outer     0.1003 rel:     1.2070  speedup:     2.0394
m  20 n  64   1.25e-04 MATLAB:     0.0921 AdotB :     0.1237 GB,auto::     0.0599(G) outer     0.0780 rel:     1.5862  speedup:     1.5365
m  20 n  65   1.27e-04 MATLAB:     0.0893 AdotB :     0.1218 GB,auto::     0.0608(G) outer     0.0769 rel:     1.5837  speedup:     1.4669
m  20 n  70   1.37e-04 MATLAB:     0.0932 AdotB :     0.1330 GB,auto::     0.0603(G) outer     0.0760 rel:     1.7494  speedup:     1.5460
m  20 n  80   1.57e-04 MATLAB:     0.0957 AdotB :     0.1517 GB,auto::     0.0738(G) outer     0.0788 rel:     1.9243  speedup:     1.2976
m  20 n  90   1.76e-04 MATLAB:     0.0974 AdotB :     0.1663 GB,auto::     0.0750(G) outer     0.0804 rel:     2.0678  speedup:     1.2987
m  20 n 100   1.96e-04 MATLAB:     0.0963 AdotB :     0.1915 GB,auto::     0.0816(G) outer     0.0845 rel:     2.2672  speedup:     1.1807

m  30 n   1   3.00e-06 MATLAB:     0.0031 AdotB :     0.0028 GB,auto::     0.0027(d) outer     0.0984 rel:     0.0287  speedup:     1.1346
m  30 n  10   2.97e-05 MATLAB:     0.0314 AdotB :     0.0281 GB,auto::     0.0323(G) outer     0.1009 rel:     0.2788  speedup:     0.9749
m  30 n  20   5.88e-05 MATLAB:     0.0604 AdotB :     0.0579 GB,auto::     0.0376(G) outer     0.1036 rel:     0.5593  speedup:     1.6060
m  30 n  30   8.74e-05 MATLAB:     0.0980 AdotB :     0.0872 GB,auto::     0.0492(G) outer     0.0820 rel:     1.0634  speedup:     1.9929
m  30 n  40   1.17e-04 MATLAB:     0.1184 AdotB :     0.1110 GB,auto::     0.0533(G) outer     0.1217 rel:     0.9117  speedup:     2.2235
m  30 n  50   1.46e-04 MATLAB:     0.1547 AdotB :     0.1470 GB,auto::     0.0590(G) outer     0.1163 rel:     1.2639  speedup:     2.6205
m  30 n  60   1.75e-04 MATLAB:     0.1872 AdotB :     0.1674 GB,auto::     0.0796(G) outer     0.1203 rel:     1.3915  speedup:     2.3533
m  30 n  61   1.78e-04 MATLAB:     0.1805 AdotB :     0.1737 GB,auto::     0.0768(G) outer     0.1133 rel:     1.5325  speedup:     2.3499
m  30 n  62   1.81e-04 MATLAB:     0.1926 AdotB :     0.1744 GB,auto::     0.0780(G) outer     0.1174 rel:     1.4851  speedup:     2.4680
m  30 n  63   1.83e-04 MATLAB:     0.1875 AdotB :     0.1733 GB,auto::     0.0819(G) outer     0.1178 rel:     1.4709  speedup:     2.2884
m  30 n  64   1.86e-04 MATLAB:     0.1018 AdotB :     0.1802 GB,auto::     0.0813(G) outer     0.0907 rel:     1.9870  speedup:     1.2526
m  30 n  65   1.89e-04 MATLAB:     0.1057 AdotB :     0.1950 GB,auto::     0.0875(G) outer     0.0890 rel:     2.1911  speedup:     1.2087
m  30 n  70   2.04e-04 MATLAB:     0.1033 AdotB :     0.1961 GB,auto::     0.0865(G) outer     0.0932 rel:     2.1046  speedup:     1.1942
m  30 n  80   2.33e-04 MATLAB:     0.1054 AdotB :     0.2241 GB,auto::     0.0972(G) outer     0.0916 rel:     2.4469  speedup:     1.0849
m  30 n  90   2.62e-04 MATLAB:     0.1081 AdotB :     0.2632 GB,auto::     0.1082(G) outer     0.0991 rel:     2.6561  speedup:     0.9993
m  30 n 100   2.91e-04 MATLAB:     0.1120 AdotB :     0.2908 GB,auto::     0.1119(G) outer     0.0975 rel:     2.9824  speedup:     1.0007

m  40 n   1   4.00e-06 MATLAB:     0.0043 AdotB :     0.0040 GB,auto::     0.0039(d) outer     0.1309 rel:     0.0309  speedup:     1.1053
m  40 n  10   3.96e-05 MATLAB:     0.0407 AdotB :     0.0404 GB,auto::     0.0416(G) outer     0.1110 rel:     0.3639  speedup:     0.9775
m  40 n  20   7.84e-05 MATLAB:     0.0814 AdotB :     0.0763 GB,auto::     0.0488(G) outer     0.1206 rel:     0.6325  speedup:     1.6677
m  40 n  30   1.17e-04 MATLAB:     0.1235 AdotB :     0.1195 GB,auto::     0.0529(G) outer     0.0964 rel:     1.2402  speedup:     2.3337
m  40 n  40   1.54e-04 MATLAB:     0.1625 AdotB :     0.1551 GB,auto::     0.0645(G) outer     0.0998 rel:     1.5548  speedup:     2.5193
m  40 n  50   1.92e-04 MATLAB:     0.2088 AdotB :     0.1858 GB,auto::     0.0735(G) outer     0.1217 rel:     1.5273  speedup:     2.8412
m  40 n  60   2.31e-04 MATLAB:     0.2479 AdotB :     0.2363 GB,auto::     0.0830(G) outer     0.1246 rel:     1.8967  speedup:     2.9876
m  40 n  61   2.35e-04 MATLAB:     0.2643 AdotB :     0.2522 GB,auto::     0.1134(G) outer     0.1327 rel:     1.9008  speedup:     2.3302
m  40 n  62   2.38e-04 MATLAB:     0.2633 AdotB :     0.2382 GB,auto::     0.0928(G) outer     0.1299 rel:     1.8328  speedup:     2.8360
m  40 n  63   2.42e-04 MATLAB:     0.2618 AdotB :     0.2421 GB,auto::     0.1042(G) outer     0.1276 rel:     1.8980  speedup:     2.5129
m  40 n  64   2.46e-04 MATLAB:     0.1193 AdotB :     0.2444 GB,auto::     0.0869(G) outer     0.1022 rel:     2.3905  speedup:     1.3724
m  40 n  65   2.50e-04 MATLAB:     0.1165 AdotB :     0.2550 GB,auto::     0.0898(G) outer     0.1029 rel:     2.4792  speedup:     1.2965
m  40 n  70   2.69e-04 MATLAB:     0.1179 AdotB :     0.2635 GB,auto::     0.0939(G) outer     0.1050 rel:     2.5111  speedup:     1.2555
m  40 n  80   3.08e-04 MATLAB:     0.1235 AdotB :     0.3206 GB,auto::     0.1323(G) outer     0.1048 rel:     3.0594  speedup:     0.9334
m  40 n  90   3.46e-04 MATLAB:     0.1258 AdotB :     0.3457 GB,auto::     0.1376(G) outer     0.1070 rel:     3.2305  speedup:     0.9143
m  40 n 100   3.85e-04 MATLAB:     0.1276 AdotB :     0.3949 GB,auto::     0.1574(G) outer     0.1117 rel:     3.5359  speedup:     0.8106

m  50 n   1   5.00e-06 MATLAB:     0.0051 AdotB :     0.0047 GB,auto::     0.0046(d) outer     0.1279 rel:     0.0369  speedup:     1.1106
m  50 n  10   4.95e-05 MATLAB:     0.0510 AdotB :     0.0500 GB,auto::     0.0486(G) outer     0.1300 rel:     0.3847  speedup:     1.0486
m  50 n  20   9.80e-05 MATLAB:     0.0992 AdotB :     0.0947 GB,auto::     0.0627(G) outer     0.1361 rel:     0.6955  speedup:     1.5827
m  50 n  30   1.46e-04 MATLAB:     0.1535 AdotB :     0.1440 GB,auto::     0.0695(G) outer     0.1314 rel:     1.0960  speedup:     2.2088
m  50 n  40   1.92e-04 MATLAB:     0.2171 AdotB :     0.1967 GB,auto::     0.0819(G) outer     0.1357 rel:     1.4496  speedup:     2.6505
m  50 n  50   2.38e-04 MATLAB:     0.2571 AdotB :     0.2370 GB,auto::     0.1025(G) outer     0.1178 rel:     2.0123  speedup:     2.5076
m  50 n  60   2.86e-04 MATLAB:     0.3004 AdotB :     0.2929 GB,auto::     0.0989(G) outer     0.1358 rel:     2.1571  speedup:     3.0385
m  50 n  61   2.90e-04 MATLAB:     0.3221 AdotB :     0.2951 GB,auto::     0.1472(G) outer     0.1491 rel:     1.9794  speedup:     2.1889
m  50 n  62   2.95e-04 MATLAB:     0.3264 AdotB :     0.3050 GB,auto::     0.1128(G) outer     0.1371 rel:     2.2237  speedup:     2.8923
m  50 n  63   3.00e-04 MATLAB:     0.3244 AdotB :     0.3046 GB,auto::     0.1047(G) outer     0.1421 rel:     2.1437  speedup:     3.0985
m  50 n  64   3.05e-04 MATLAB:     0.1270 AdotB :     0.3049 GB,auto::     0.1178(G) outer     0.1177 rel:     2.5915  speedup:     1.0785
m  50 n  65   3.10e-04 MATLAB:     0.1279 AdotB :     0.3179 GB,auto::     0.1095(G) outer     0.1172 rel:     2.7119  speedup:     1.1676
m  50 n  70   3.33e-04 MATLAB:     0.1353 AdotB :     0.3262 GB,auto::     0.1119(G) outer     0.1181 rel:     2.7618  speedup:     1.2098
m  50 n  80   3.81e-04 MATLAB:     0.1331 AdotB :     0.3883 GB,auto::     0.1981(G) outer     0.1248 rel:     3.1124  speedup:     0.6719
m  50 n  90   4.29e-04 MATLAB:     0.1405 AdotB :     0.4511 GB,auto::     0.1359(G) outer     0.1220 rel:     3.6976  speedup:     1.0338
m  50 n 100   4.76e-04 MATLAB:     0.1428 AdotB :     0.4716 GB,auto::     0.1564(G) outer     0.1225 rel:     3.8484  speedup:     0.9134

m  60 n   1   5.99e-06 MATLAB:     0.0061 AdotB :     0.0057 GB,auto::     0.0057(d) outer     0.1342 rel:     0.0423  speedup:     1.0661
m  60 n  10   5.94e-05 MATLAB:     0.0631 AdotB :     0.0621 GB,auto::     0.0631(G) outer     0.1341 rel:     0.4634  speedup:     0.9993
m  60 n  20   1.18e-04 MATLAB:     0.1261 AdotB :     0.1136 GB,auto::     0.0752(G) outer     0.1387 rel:     0.8186  speedup:     1.6766
m  60 n  30   1.75e-04 MATLAB:     0.1940 AdotB :     0.1781 GB,auto::     0.0844(G) outer     0.1480 rel:     1.2036  speedup:     2.2978
m  60 n  40   2.31e-04 MATLAB:     0.2561 AdotB :     0.2249 GB,auto::     0.0975(G) outer     0.1476 rel:     1.5237  speedup:     2.6270
m  60 n  50   2.86e-04 MATLAB:     0.3032 AdotB :     0.2882 GB,auto::     0.1161(G) outer     0.1277 rel:     2.2569  speedup:     2.6114
m  60 n  60   3.40e-04 MATLAB:     0.3812 AdotB :     0.3676 GB,auto::     0.1367(G) outer     0.1299 rel:     2.8298  speedup:     2.7873
m  60 n  61   3.45e-04 MATLAB:     0.3814 AdotB :     0.3507 GB,auto::     0.1404(G) outer     0.1537 rel:     2.2824  speedup:     2.7168
m  60 n  62   3.51e-04 MATLAB:     0.3899 AdotB :     0.3643 GB,auto::     0.1279(G) outer     0.1506 rel:     2.4186  speedup:     3.0487
m  60 n  63   3.57e-04 MATLAB:     0.3906 AdotB :     0.3667 GB,auto::     0.1274(G) outer     0.1527 rel:     2.4012  speedup:     3.0660
m  60 n  64   3.62e-04 MATLAB:     0.1489 AdotB :     0.3859 GB,auto::     0.1352(G) outer     0.1309 rel:     2.9486  speedup:     1.1011
m  60 n  65   3.68e-04 MATLAB:     0.1468 AdotB :     0.3765 GB,auto::     0.1317(G) outer     0.1289 rel:     2.9204  speedup:     1.1145
m  60 n  70   3.96e-04 MATLAB:     0.1466 AdotB :     0.3979 GB,auto::     0.1454(G) outer     0.1305 rel:     3.0487  speedup:     1.0077
m  60 n  80   4.53e-04 MATLAB:     0.1475 AdotB :     0.4660 GB,auto::     0.1529(G) outer     0.1323 rel:     3.5221  speedup:     0.9650
m  60 n  90   5.09e-04 MATLAB:     0.1515 AdotB :     0.5107 GB,auto::     0.1938(G) outer     0.1358 rel:     3.7606  speedup:     0.7817
m  60 n 100   5.66e-04 MATLAB:     0.1530 AdotB :     0.5678 GB,auto::     0.1723(G) outer     0.1422 rel:     3.9936  speedup:     0.8883

m  61 n   1   6.09e-06 MATLAB:     0.0062 AdotB :     0.0058 GB,auto::     0.0056(d) outer     0.1341 rel:     0.0431  speedup:     1.0984
m  61 n  10   6.04e-05 MATLAB:     0.0631 AdotB :     0.0602 GB,auto::     0.0687(G) outer     0.1446 rel:     0.4163  speedup:     0.9185
m  61 n  20   1.20e-04 MATLAB:     0.1250 AdotB :     0.1237 GB,auto::     0.0801(G) outer     0.1448 rel:     0.8538  speedup:     1.5600
m  61 n  30   1.78e-04 MATLAB:     0.2030 AdotB :     0.1823 GB,auto::     0.0933(G) outer     0.1432 rel:     1.2732  speedup:     2.1759
m  61 n  40   2.35e-04 MATLAB:     0.2566 AdotB :     0.2424 GB,auto::     0.1012(G) outer     0.1507 rel:     1.6082  speedup:     2.5364
m  61 n  50   2.90e-04 MATLAB:     0.3144 AdotB :     0.2860 GB,auto::     0.1161(G) outer     0.1519 rel:     1.8828  speedup:     2.7091
m  61 n  60   3.45e-04 MATLAB:     0.3819 AdotB :     0.3593 GB,auto::     0.1272(G) outer     0.1336 rel:     2.6892  speedup:     3.0022
m  61 n  61   3.51e-04 MATLAB:     0.3864 AdotB :     0.3568 GB,auto::     0.1340(G) outer     0.1447 rel:     2.4651  speedup:     2.8841
m  61 n  62   3.56e-04 MATLAB:     0.3872 AdotB :     0.3590 GB,auto::     0.1258(G) outer     0.1535 rel:     2.3397  speedup:     3.0791
m  61 n  63   3.62e-04 MATLAB:     0.4036 AdotB :     0.3826 GB,auto::     0.1411(G) outer     0.1553 rel:     2.4641  speedup:     2.8601
m  61 n  64   3.68e-04 MATLAB:     0.1551 AdotB :     0.3786 GB,auto::     0.1342(G) outer     0.1306 rel:     2.8984  speedup:     1.1555
m  61 n  65   3.74e-04 MATLAB:     0.1425 AdotB :     0.3781 GB,auto::     0.1521(G) outer     0.1315 rel:     2.8761  speedup:     0.9370
m  61 n  70   4.02e-04 MATLAB:     0.1481 AdotB :     0.4082 GB,auto::     0.1457(G) outer     0.1321 rel:     3.0901  speedup:     1.0163
m  61 n  80   4.60e-04 MATLAB:     0.1459 AdotB :     0.4707 GB,auto::     0.1632(G) outer     0.1349 rel:     3.4895  speedup:     0.8942
m  61 n  90   5.17e-04 MATLAB:     0.1540 AdotB :     0.5162 GB,auto::     0.1682(G) outer     0.1389 rel:     3.7159  speedup:     0.9160
m  61 n 100   5.75e-04 MATLAB:     0.1641 AdotB :     0.5937 GB,auto::     0.1776(G) outer     0.1391 rel:     4.2697  speedup:     0.9237

m  62 n   1   6.19e-06 MATLAB:     0.0063 AdotB :     0.0058 GB,auto::     0.0059(d) outer     0.1369 rel:     0.0425  speedup:     1.0697
m  62 n  10   6.14e-05 MATLAB:     0.0632 AdotB :     0.0639 GB,auto::     0.0723(G) outer     0.1436 rel:     0.4446  speedup:     0.8738
m  62 n  20   1.22e-04 MATLAB:     0.1280 AdotB :     0.1172 GB,auto::     0.0747(G) outer     0.1561 rel:     0.7505  speedup:     1.7140
m  62 n  30   1.81e-04 MATLAB:     0.1859 AdotB :     0.1768 GB,auto::     0.0974(G) outer     0.1605 rel:     1.1016  speedup:     1.9090
m  62 n  40   2.38e-04 MATLAB:     0.2631 AdotB :     0.2434 GB,auto::     0.1080(G) outer     0.1510 rel:     1.6116  speedup:     2.4350
m  62 n  50   2.95e-04 MATLAB:     0.3470 AdotB :     0.2997 GB,auto::     0.1226(G) outer     0.1542 rel:     1.9438  speedup:     2.8315
m  62 n  60   3.51e-04 MATLAB:     0.5090 AdotB :     0.4688 GB,auto::     0.1531(G) outer     0.1299 rel:     3.6097  speedup:     3.3244
m  62 n  61   3.56e-04 MATLAB:     0.3925 AdotB :     0.3728 GB,auto::     0.1322(G) outer     0.1314 rel:     2.8365  speedup:     2.9681
m  62 n  62   3.62e-04 MATLAB:     0.4110 AdotB :     0.3805 GB,auto::     0.1481(G) outer     0.1454 rel:     2.6180  speedup:     2.7746
m  62 n  63   3.68e-04 MATLAB:     0.4058 AdotB :     0.3754 GB,auto::     0.1304(G) outer     0.1642 rel:     2.2864  speedup:     3.1108
m  62 n  64   3.74e-04 MATLAB:     0.1429 AdotB :     0.3783 GB,auto::     0.1585(G) outer     0.1394 rel:     2.7132  speedup:     0.9015
m  62 n  65   3.79e-04 MATLAB:     0.1476 AdotB :     0.3922 GB,auto::     0.1382(G) outer     0.1299 rel:     3.0201  speedup:     1.0677
m  62 n  70   4.09e-04 MATLAB:     0.1465 AdotB :     0.4126 GB,auto::     0.1444(G) outer     0.1313 rel:     3.1422  speedup:     1.0143
m  62 n  80   4.67e-04 MATLAB:     0.1520 AdotB :     0.4738 GB,auto::     0.1568(G) outer     0.1354 rel:     3.4995  speedup:     0.9694
m  62 n  90   5.25e-04 MATLAB:     0.1514 AdotB :     0.5429 GB,auto::     0.1699(G) outer     0.1405 rel:     3.8650  speedup:     0.8910
m  62 n 100   5.84e-04 MATLAB:     0.1594 AdotB :     0.5962 GB,auto::     0.1847(G) outer     0.1450 rel:     4.1123  speedup:     0.8627

m  63 n   1   6.29e-06 MATLAB:     0.0064 AdotB :     0.0059 GB,auto::     0.0059(d) outer     0.1376 rel:     0.0432  speedup:     1.0897
m  63 n  10   6.24e-05 MATLAB:     0.0662 AdotB :     0.0619 GB,auto::     0.0672(G) outer     0.1407 rel:     0.4397  speedup:     0.9854
m  63 n  20   1.24e-04 MATLAB:     0.1335 AdotB :     0.1243 GB,auto::     0.0735(G) outer     0.1486 rel:     0.8369  speedup:     1.8152
m  63 n  30   1.83e-04 MATLAB:     0.2021 AdotB :     0.1838 GB,auto::     0.0966(G) outer     0.1496 rel:     1.2282  speedup:     2.0915
m  63 n  40   2.42e-04 MATLAB:     0.2588 AdotB :     0.2357 GB,auto::     0.1053(G) outer     0.1466 rel:     1.6069  speedup:     2.4574
m  63 n  50   3.00e-04 MATLAB:     0.3229 AdotB :     0.3025 GB,auto::     0.1264(G) outer     0.1615 rel:     1.8735  speedup:     2.5542
m  63 n  60   3.57e-04 MATLAB:     0.3960 AdotB :     0.3627 GB,auto::     0.1279(G) outer     0.1350 rel:     2.6872  speedup:     3.0972
m  63 n  61   3.62e-04 MATLAB:     0.3971 AdotB :     0.3806 GB,auto::     0.1320(G) outer     0.1341 rel:     2.8388  speedup:     3.0072
m  63 n  62   3.68e-04 MATLAB:     0.4021 AdotB :     0.3837 GB,auto::     0.1495(G) outer     0.1326 rel:     2.8942  speedup:     2.6902
m  63 n  63   3.73e-04 MATLAB:     0.4151 AdotB :     0.3751 GB,auto::     0.1388(G) outer     0.1319 rel:     2.8442  speedup:     2.9902
m  63 n  64   3.79e-04 MATLAB:     0.1431 AdotB :     0.3869 GB,auto::     0.1332(G) outer     0.1328 rel:     2.9127  speedup:     1.0741
m  63 n  65   3.85e-04 MATLAB:     0.1512 AdotB :     0.4022 GB,auto::     0.1375(G) outer     0.1344 rel:     2.9927  speedup:     1.0996
m  63 n  70   4.15e-04 MATLAB:     0.1470 AdotB :     0.4269 GB,auto::     0.1456(G) outer     0.1361 rel:     3.1372  speedup:     1.0092
m  63 n  80   4.74e-04 MATLAB:     0.1554 AdotB :     0.4879 GB,auto::     0.1603(G) outer     0.1481 rel:     3.2939  speedup:     0.9689
m  63 n  90   5.33e-04 MATLAB:     0.1519 AdotB :     0.5451 GB,auto::     0.1706(G) outer     0.1388 rel:     3.9270  speedup:     0.8906
m  63 n 100   5.93e-04 MATLAB:     0.1610 AdotB :     0.6118 GB,auto::     0.1805(G) outer     0.1444 rel:     4.2378  speedup:     0.8923

m  64 n   1   6.39e-06 MATLAB:     0.1318 AdotB :     0.0061 GB,auto::     0.0060(d) outer     0.1167 rel:     0.0527  speedup:    22.0301
m  64 n  10   6.34e-05 MATLAB:     0.1320 AdotB :     0.0656 GB,auto::     0.0736(G) outer     0.1233 rel:     0.5322  speedup:     1.7933
m  64 n  20   1.25e-04 MATLAB:     0.1338 AdotB :     0.1223 GB,auto::     0.0786(G) outer     0.1199 rel:     1.0195  speedup:     1.7021
m  64 n  30   1.86e-04 MATLAB:     0.1389 AdotB :     0.1818 GB,auto::     0.0927(G) outer     0.1267 rel:     1.4343  speedup:     1.4979
m  64 n  40   2.46e-04 MATLAB:     0.1399 AdotB :     0.2410 GB,auto::     0.1168(G) outer     0.1399 rel:     1.7234  speedup:     1.1973
m  64 n  50   3.05e-04 MATLAB:     0.1418 AdotB :     0.3018 GB,auto::     0.1205(G) outer     0.1327 rel:     2.2749  speedup:     1.1766
m  64 n  60   3.62e-04 MATLAB:     0.1239 AdotB :     0.3895 GB,auto::     0.1329(G) outer     0.1339 rel:     2.9076  speedup:     0.9321
m  64 n  61   3.68e-04 MATLAB:     0.1220 AdotB :     0.3679 GB,auto::     0.1313(G) outer     0.1352 rel:     2.7214  speedup:     0.9292
m  64 n  62   3.74e-04 MATLAB:     0.1248 AdotB :     0.3914 GB,auto::     0.1385(G) outer     0.1335 rel:     2.9317  speedup:     0.9016
m  64 n  63   3.79e-04 MATLAB:     0.1339 AdotB :     0.3893 GB,auto::     0.1347(G) outer     0.1362 rel:     2.8582  speedup:     0.9945
m  64 n  64   3.85e-04 MATLAB:     0.1232 AdotB :     0.3921 GB,auto::     0.1438(G) outer     0.1385 rel:     2.8310  speedup:     0.8571
m  64 n  65   3.91e-04 MATLAB:     0.1543 AdotB :     0.4182 GB,auto::     0.1453(G) outer     0.1352 rel:     3.0931  speedup:     1.0624
m  64 n  70   4.21e-04 MATLAB:     0.1474 AdotB :     0.4281 GB,auto::     0.1461(G) outer     0.1369 rel:     3.1267  speedup:     1.0090
m  64 n  80   4.81e-04 MATLAB:     0.1587 AdotB :     0.4892 GB,auto::     0.1576(G) outer     0.1386 rel:     3.5289  speedup:     1.0070
m  64 n  90   5.41e-04 MATLAB:     0.1578 AdotB :     0.5684 GB,auto::     0.1742(G) outer     0.1439 rel:     3.9514  speedup:     0.9064
m  64 n 100   6.02e-04 MATLAB:     0.1587 AdotB :     0.6052 GB,auto::     0.1855(G) outer     0.1485 rel:     4.0758  speedup:     0.8554

m  65 n   1   6.49e-06 MATLAB:     0.1279 AdotB :     0.0063 GB,auto::     0.0060(d) outer     0.1186 rel:     0.0529  speedup:    21.2298
m  65 n  10   6.44e-05 MATLAB:     0.1402 AdotB :     0.0645 GB,auto::     0.0762(G) outer     0.1252 rel:     0.5155  speedup:     1.8387
m  65 n  20   1.27e-04 MATLAB:     0.1354 AdotB :     0.1251 GB,auto::     0.0834(G) outer     0.1257 rel:     0.9950  speedup:     1.6233
m  65 n  30   1.89e-04 MATLAB:     0.1391 AdotB :     0.1864 GB,auto::     0.0956(G) outer     0.1262 rel:     1.4774  speedup:     1.4544
m  65 n  40   2.50e-04 MATLAB:     0.1409 AdotB :     0.2545 GB,auto::     0.1031(G) outer     0.1300 rel:     1.9576  speedup:     1.3671
m  65 n  50   3.10e-04 MATLAB:     0.1455 AdotB :     0.3186 GB,auto::     0.1286(G) outer     0.1295 rel:     2.4597  speedup:     1.1318
m  65 n  60   3.68e-04 MATLAB:     0.1351 AdotB :     0.3697 GB,auto::     0.1330(G) outer     0.1346 rel:     2.7464  speedup:     1.0157
m  65 n  61   3.74e-04 MATLAB:     0.1224 AdotB :     0.3811 GB,auto::     0.1369(G) outer     0.1384 rel:     2.7545  speedup:     0.8937
m  65 n  62   3.79e-04 MATLAB:     0.1267 AdotB :     0.3863 GB,auto::     0.1435(G) outer     0.1363 rel:     2.8351  speedup:     0.8826
m  65 n  63   3.85e-04 MATLAB:     0.1272 AdotB :     0.3950 GB,auto::     0.1474(G) outer     0.1343 rel:     2.9414  speedup:     0.8632
m  65 n  64   3.91e-04 MATLAB:     0.1248 AdotB :     0.4001 GB,auto::     0.1476(G) outer     0.1367 rel:     2.9271  speedup:     0.8458
m  65 n  65   3.97e-04 MATLAB:     0.1289 AdotB :     0.4069 GB,auto::     0.1462(G) outer     0.1345 rel:     3.0242  speedup:     0.8816
m  65 n  70   4.27e-04 MATLAB:     0.1497 AdotB :     0.4240 GB,auto::     0.1527(G) outer     0.1366 rel:     3.1049  speedup:     0.9801
m  65 n  80   4.88e-04 MATLAB:     0.1592 AdotB :     0.4928 GB,auto::     0.1624(G) outer     0.1435 rel:     3.4344  speedup:     0.9805
m  65 n  90   5.49e-04 MATLAB:     0.1589 AdotB :     0.5657 GB,auto::     0.1789(G) outer     0.1445 rel:     3.9150  speedup:     0.8885
m  65 n 100   6.10e-04 MATLAB:     0.1619 AdotB :     0.6434 GB,auto::     0.1902(G) outer     0.1457 rel:     4.4168  speedup:     0.8513

m  70 n   1   6.99e-06 MATLAB:     0.1322 AdotB :     0.0067 GB,auto::     0.0065(d) outer     0.1195 rel:     0.0564  speedup:    20.4196
m  70 n  10   6.93e-05 MATLAB:     0.1397 AdotB :     0.0655 GB,auto::     0.0743(G) outer     0.1248 rel:     0.5250  speedup:     1.8799
m  70 n  20   1.37e-04 MATLAB:     0.1396 AdotB :     0.1350 GB,auto::     0.0915(G) outer     0.1307 rel:     1.0330  speedup:     1.5260
m  70 n  30   2.04e-04 MATLAB:     0.1428 AdotB :     0.2014 GB,auto::     0.1086(G) outer     0.1334 rel:     1.5096  speedup:     1.3155
m  70 n  40   2.69e-04 MATLAB:     0.1501 AdotB :     0.2732 GB,auto::     0.1129(G) outer     0.1344 rel:     2.0327  speedup:     1.3300
m  70 n  50   3.33e-04 MATLAB:     0.1577 AdotB :     0.3378 GB,auto::     0.1315(G) outer     0.1419 rel:     2.3800  speedup:     1.1992
m  70 n  60   3.96e-04 MATLAB:     0.1305 AdotB :     0.3946 GB,auto::     0.1517(G) outer     0.1427 rel:     2.7658  speedup:     0.8603
m  70 n  61   4.02e-04 MATLAB:     0.1360 AdotB :     0.4259 GB,auto::     0.1522(G) outer     0.1450 rel:     2.9372  speedup:     0.8938
m  70 n  62   4.09e-04 MATLAB:     0.1294 AdotB :     0.4144 GB,auto::     0.1600(G) outer     0.1445 rel:     2.8684  speedup:     0.8087
m  70 n  63   4.15e-04 MATLAB:     0.1293 AdotB :     0.4265 GB,auto::     0.1538(G) outer     0.1415 rel:     3.0130  speedup:     0.8405
m  70 n  64   4.21e-04 MATLAB:     0.1301 AdotB :     0.4202 GB,auto::     0.1678(G) outer     0.1461 rel:     2.8760  speedup:     0.7753
m  70 n  65   4.27e-04 MATLAB:     0.1429 AdotB :     0.4421 GB,auto::     0.1555(G) outer     0.1415 rel:     3.1235  speedup:     0.9191
m  70 n  70   4.58e-04 MATLAB:     0.1335 AdotB :     0.4686 GB,auto::     0.1798(G) outer     0.1484 rel:     3.1583  speedup:     0.7425
m  70 n  80   5.23e-04 MATLAB:     0.1621 AdotB :     0.5364 GB,auto::     0.1753(G) outer     0.1528 rel:     3.5100  speedup:     0.9249
m  70 n  90   5.89e-04 MATLAB:     0.1651 AdotB :     0.6081 GB,auto::     0.1887(G) outer     0.1504 rel:     4.0417  speedup:     0.8747
m  70 n 100   6.54e-04 MATLAB:     0.1750 AdotB :     0.6886 GB,auto::     0.2056(G) outer     0.1565 rel:     4.3994  speedup:     0.8512

m  80 n   1   7.99e-06 MATLAB:     0.1507 AdotB :     0.0079 GB,auto::     0.0075(d) outer     0.1339 rel:     0.0588  speedup:    20.0675
m  80 n  10   7.92e-05 MATLAB:     0.1506 AdotB :     0.0752 GB,auto::     0.0852(G) outer     0.1397 rel:     0.5387  speedup:     1.7673
m  80 n  20   1.57e-04 MATLAB:     0.1534 AdotB :     0.1538 GB,auto::     0.1014(G) outer     0.1404 rel:     1.0955  speedup:     1.5127
m  80 n  30   2.33e-04 MATLAB:     0.1581 AdotB :     0.2364 GB,auto::     0.1224(G) outer     0.1451 rel:     1.6297  speedup:     1.2923
m  80 n  40   3.08e-04 MATLAB:     0.1617 AdotB :     0.3200 GB,auto::     0.1405(G) outer     0.1459 rel:     2.1929  speedup:     1.1506
m  80 n  50   3.81e-04 MATLAB:     0.1648 AdotB :     0.3791 GB,auto::     0.1489(G) outer     0.1499 rel:     2.5285  speedup:     1.1067
m  80 n  60   4.53e-04 MATLAB:     0.1665 AdotB :     0.4653 GB,auto::     0.1732(G) outer     0.1571 rel:     2.9623  speedup:     0.9616
m  80 n  61   4.60e-04 MATLAB:     0.1444 AdotB :     0.4633 GB,auto::     0.1680(G) outer     0.1633 rel:     2.8377  speedup:     0.8592
m  80 n  62   4.67e-04 MATLAB:     0.1440 AdotB :     0.4903 GB,auto::     0.1775(G) outer     0.1575 rel:     3.1130  speedup:     0.8110
m  80 n  63   4.74e-04 MATLAB:     0.1432 AdotB :     0.4842 GB,auto::     0.1770(G) outer     0.1555 rel:     3.1133  speedup:     0.8087
m  80 n  64   4.81e-04 MATLAB:     0.1560 AdotB :     0.5066 GB,auto::     0.1778(G) outer     0.1569 rel:     3.2278  speedup:     0.8774
m  80 n  65   4.88e-04 MATLAB:     0.1448 AdotB :     0.5115 GB,auto::     0.1774(G) outer     0.1570 rel:     3.2579  speedup:     0.8164
m  80 n  70   5.23e-04 MATLAB:     0.1465 AdotB :     0.5346 GB,auto::     0.1924(G) outer     0.1570 rel:     3.4052  speedup:     0.7615
m  80 n  80   5.93e-04 MATLAB:     0.1505 AdotB :     0.6083 GB,auto::     0.2168(G) outer     0.1641 rel:     3.7080  speedup:     0.6941
m  80 n  90   6.67e-04 MATLAB:     0.1773 AdotB :     0.6972 GB,auto::     0.2273(G) outer     0.1645 rel:     4.2397  speedup:     0.7801
m  80 n 100   7.41e-04 MATLAB:     0.1805 AdotB :     0.7719 GB,auto::     0.2317(G) outer     0.1629 rel:     4.7374  speedup:     0.7789

m  90 n   1   8.99e-06 MATLAB:     0.1627 AdotB :     0.0085 GB,auto::     0.0083(d) outer     0.1448 rel:     0.0590  speedup:    19.5238
m  90 n  10   8.91e-05 MATLAB:     0.1726 AdotB :     0.0876 GB,auto::     0.1029(G) outer     0.1505 rel:     0.5819  speedup:     1.6779
m  90 n  20   1.76e-04 MATLAB:     0.1710 AdotB :     0.1738 GB,auto::     0.1282(G) outer     0.1511 rel:     1.1499  speedup:     1.3344
m  90 n  30   2.62e-04 MATLAB:     0.1683 AdotB :     0.2595 GB,auto::     0.1323(G) outer     0.1578 rel:     1.6442  speedup:     1.2720
m  90 n  40   3.46e-04 MATLAB:     0.1730 AdotB :     0.3563 GB,auto::     0.1616(G) outer     0.1599 rel:     2.2277  speedup:     1.0708
m  90 n  50   4.29e-04 MATLAB:     0.1760 AdotB :     0.4335 GB,auto::     0.1733(G) outer     0.1641 rel:     2.6411  speedup:     1.0154
m  90 n  60   5.09e-04 MATLAB:     0.1953 AdotB :     0.5153 GB,auto::     0.1925(G) outer     0.1691 rel:     3.0478  speedup:     1.0149
m  90 n  61   5.17e-04 MATLAB:     0.1782 AdotB :     0.5315 GB,auto::     0.1867(G) outer     0.1667 rel:     3.1883  speedup:     0.9544
m  90 n  62   5.25e-04 MATLAB:     0.1822 AdotB :     0.5378 GB,auto::     0.1891(G) outer     0.1686 rel:     3.1905  speedup:     0.9634
m  90 n  63   5.33e-04 MATLAB:     0.1839 AdotB :     0.5447 GB,auto::     0.2018(G) outer     0.1712 rel:     3.1815  speedup:     0.9112
m  90 n  64   5.41e-04 MATLAB:     0.1820 AdotB :     0.5543 GB,auto::     0.1912(G) outer     0.1713 rel:     3.2352  speedup:     0.9519
m  90 n  65   5.49e-04 MATLAB:     0.2101 AdotB :     0.5635 GB,auto::     0.1957(G) outer     0.1815 rel:     3.1049  speedup:     1.0738
m  90 n  70   5.89e-04 MATLAB:     0.1836 AdotB :     0.6033 GB,auto::     0.2006(G) outer     0.1726 rel:     3.4947  speedup:     0.9152
m  90 n  80   6.67e-04 MATLAB:     0.1887 AdotB :     0.6861 GB,auto::     0.2331(G) outer     0.1827 rel:     3.7555  speedup:     0.8097
m  90 n  90   7.43e-04 MATLAB:     0.2025 AdotB :     0.7864 GB,auto::     0.2578(G) outer     0.1762 rel:     4.4617  speedup:     0.7854
m  90 n 100   8.26e-04 MATLAB:     0.1995 AdotB :     0.8595 GB,auto::     0.2584(G) outer     0.1903 rel:     4.5166  speedup:     0.7721

m 100 n   1   9.99e-06 MATLAB:     0.1728 AdotB :     0.0095 GB,auto::     0.0093(d) outer     0.1586 rel:     0.0596  speedup:    18.5459
m 100 n  10   9.90e-05 MATLAB:     0.1781 AdotB :     0.0960 GB,auto::     0.1074(G) outer     0.1625 rel:     0.5906  speedup:     1.6580
m 100 n  20   1.96e-04 MATLAB:     0.1778 AdotB :     0.1904 GB,auto::     0.1440(G) outer     0.1734 rel:     1.0978  speedup:     1.2345
m 100 n  30   2.91e-04 MATLAB:     0.1841 AdotB :     0.2873 GB,auto::     0.1491(G) outer     0.1735 rel:     1.6563  speedup:     1.2346
m 100 n  40   3.85e-04 MATLAB:     0.1881 AdotB :     0.3847 GB,auto::     0.1882(G) outer     0.1750 rel:     2.1982  speedup:     0.9994
m 100 n  50   4.76e-04 MATLAB:     0.1976 AdotB :     0.4689 GB,auto::     0.1894(G) outer     0.1846 rel:     2.5404  speedup:     1.0430
m 100 n  60   5.66e-04 MATLAB:     0.2059 AdotB :     0.5880 GB,auto::     0.2067(G) outer     0.1786 rel:     3.2920  speedup:     0.9961
m 100 n  61   5.75e-04 MATLAB:     0.1950 AdotB :     0.5907 GB,auto::     0.2040(G) outer     0.1841 rel:     3.2083  speedup:     0.9557
m 100 n  62   5.84e-04 MATLAB:     0.1707 AdotB :     0.5997 GB,auto::     0.2057(G) outer     0.1831 rel:     3.2753  speedup:     0.8299
m 100 n  63   5.93e-04 MATLAB:     0.1976 AdotB :     0.5990 GB,auto::     0.2157(G) outer     0.1806 rel:     3.3163  speedup:     0.9159
m 100 n  64   6.02e-04 MATLAB:     0.1945 AdotB :     0.6232 GB,auto::     0.2072(G) outer     0.1819 rel:     3.4262  speedup:     0.9391
m 100 n  65   6.10e-04 MATLAB:     0.1936 AdotB :     0.6524 GB,auto::     0.2154(G) outer     0.1816 rel:     3.5929  speedup:     0.8989
m 100 n  70   6.54e-04 MATLAB:     0.1975 AdotB :     0.6754 GB,auto::     0.2222(G) outer     0.1889 rel:     3.5748  speedup:     0.8888
m 100 n  80   7.41e-04 MATLAB:     0.1753 AdotB :     0.7742 GB,auto::     0.2860(G) outer     0.1894 rel:     4.0884  speedup:     0.6130
m 100 n  90   8.26e-04 MATLAB:     0.1796 AdotB :     0.8673 GB,auto::     0.2654(G) outer     0.1904 rel:     4.5543  speedup:     0.6766
m 100 n 100   9.09e-04 MATLAB:     0.1836 AdotB :     0.9819 GB,auto::     0.3022(G) outer     0.1943 rel:     5.0527  speedup:     0.6075

test52: all tests passed

Prob = 

  struct with fields:

         A: [9000×9000 double]
      name: 'ND/nd3k'
     title: 'ND problem set, matrix nd3k'
        id: 936
      date: '2003'
    author: 'author unknown'
        ed: 'T. Davis'
      kind: '2D/3D problem'

MATLAB time: 3.44373 3.53317 3.49125 3.77936
with mask:
MATLAB time: 3.64097 3.59673 3.59377 3.82288
  1 [ first    min  logical] : auto speedups     1.1037(G)     1.1214(G)     1.1411(G)     1.1943(G) speedups     0.9273(G)     1.0237(d)     0.9068(G)     1.0855(d) 
  2 [ first    min     int8] : auto speedups     1.1357(G)     1.1709(G)     1.1647(G)     1.2268(G) speedups     0.9950(G)     0.8801(d)     0.9831(G)     0.9240(d) 
  3 [ first    min    uint8] : auto speedups     1.1680(G)     1.2163(G)     1.1725(G)     1.2756(G) speedups     0.8609(G)     0.8748(d)     0.8457(G)     0.9118(d) 
  4 [ first    min    int16] : auto speedups     1.1530(G)     1.1457(G)     1.1675(G)     1.2287(G) speedups     0.9731(G)     0.8764(d)     0.9418(G)     0.9179(d) 
  5 [ first    min   uint16] : auto speedups     1.1539(G)     1.1849(G)     1.1578(G)     1.1908(G) speedups     0.8759(G)     0.8917(d)     0.8657(G)     0.9333(d) 
  6 [ first    min    int32] : auto speedups     1.1696(G)     1.1836(G)     1.1686(G)     1.2475(G) speedups     0.9069(G)     0.8940(d)     0.8886(G)     0.9412(d) 
  7 [ first    min   uint32] : auto speedups     1.1514(G)     1.1670(G)     1.1811(G)     1.2440(G) speedups     0.8794(G)     0.8967(d)     0.8749(G)     0.9247(d) 
  8 [ first    min    int64] : auto speedups     1.0615(G)     1.0694(G)     1.0593(G)     1.0992(G) speedups     0.8841(G)     0.8916(d)     0.8675(G)     0.9385(d) 
  9 [ first    min   uint64] : auto speedups     1.0371(G)     1.0805(G)     1.0702(G)     1.1224(G) speedups     0.9754(G)     0.8286(d)     0.9572(G)     0.8808(d) 
 10 [ first    min   single] : auto speedups     1.0382(G)     1.0249(G)     0.9650(G)     1.0505(G) speedups     0.8744(G)     0.9805(d)     0.9193(G)     1.0596(d) 
 11 [ first    min   double] : auto speedups     0.9745(G)     0.9826(G)     0.9908(G)     1.0630(G) speedups     0.8618(G)     2.5099(d)     0.8627(G)     2.5680(d) 
 12 [ first    max  logical] : auto speedups     1.2481(G)     1.2454(G)     1.2633(G)     1.3177(G) speedups     0.8700(G)     1.0307(d)     0.8431(G)     1.0979(d) 
 13 [ first    max     int8] : auto speedups     1.1382(G)     1.1611(G)     1.1541(G)     1.2490(G) speedups     1.0334(G)     0.8670(d)     0.9663(G)     0.9261(d) 
 14 [ first    max    uint8] : auto speedups     1.0986(G)     1.1161(G)     1.0970(G)     1.1720(G) speedups     0.9419(G)     0.8698(d)     0.8917(G)     0.9168(d) 
 15 [ first    max    int16] : auto speedups     1.1788(G)     1.1727(G)     1.1735(G)     1.2488(G) speedups     0.8827(G)     0.8829(d)     0.8717(G)     0.9283(d) 
 16 [ first    max   uint16] : auto speedups     1.1207(G)     1.1268(G)     1.1284(G)     1.2102(G) speedups     0.9534(G)     0.8609(d)     0.9281(G)     0.9059(d) 
 17 [ first    max    int32] : auto speedups     1.1525(G)     1.1886(G)     1.1548(G)     1.2271(G) speedups     0.8945(G)     0.8907(d)     0.8813(G)     0.9325(d) 
 18 [ first    max   uint32] : auto speedups     1.1095(G)     1.1406(G)     1.1236(G)     1.2064(G) speedups     0.8804(G)     0.8738(d)     0.8599(G)     0.9219(d) 
 19 [ first    max    int64] : auto speedups     1.0360(G)     1.0752(G)     1.0542(G)     1.1396(G) speedups     0.9654(G)     0.8955(d)     0.9639(G)     0.9464(d) 
 20 [ first    max   uint64] : auto speedups     1.0054(G)     1.0374(G)     1.0408(G)     1.0877(G) speedups     0.9549(G)     0.8447(d)     0.9421(G)     0.8768(d) 
 21 [ first    max   single] : auto speedups     1.0528(G)     1.0641(G)     1.0656(G)     1.1270(G) speedups     0.9574(G)     1.0148(d)     0.9538(G)     1.0531(d) 
 22 [ first    max   double] : auto speedups     1.0076(G)     0.8851(G)     0.9596(G)     0.9648(G) speedups     0.9480(G)     2.3602(d)     0.9396(G)     2.6610(d) 
 23 [ first   plus  logical] : auto speedups     1.2323(G)     1.2804(G)     1.2368(G)     1.3196(G) speedups     0.8581(G)     1.0372(d)     0.8448(G)     1.1039(d) 
 24 [ first   plus     int8] : auto speedups     1.2263(G)     1.2658(G)     1.2445(G)     1.3128(G) speedups     0.8961(G)     0.9092(d)     0.8744(G)     0.9384(d) 
 25 [ first   plus    uint8] : auto speedups     1.2408(G)     1.2609(G)     1.2671(G)     1.3137(G) speedups     0.8603(G)     0.8966(d)     0.8513(G)     0.9447(d) 
 26 [ first   plus    int16] : auto speedups     1.2317(G)     1.2388(G)     1.2265(G)     1.3130(G) speedups     0.9789(G)     0.8845(d)     0.9438(G)     0.9475(d) 
 27 [ first   plus   uint16] : auto speedups     1.2392(G)     1.2441(G)     1.2216(G)     1.3317(G) speedups     0.9025(G)     0.8885(d)     0.8903(G)     0.9346(d) 
 28 [ first   plus    int32] : auto speedups     1.2126(G)     1.2132(G)     1.2207(G)     1.3084(G) speedups     0.8685(G)     0.8977(d)     0.8465(G)     0.9358(d) 
 29 [ first   plus   uint32] : auto speedups     1.2329(G)     1.2382(G)     1.2249(G)     1.3237(G) speedups     0.8974(G)     0.8848(d)     0.8714(G)     0.9414(d) 
 30 [ first   plus    int64] : auto speedups     1.0928(G)     1.1019(G)     1.1164(G)     1.1772(G) speedups     0.8758(G)     0.9157(d)     0.8527(G)     0.9319(d) 
 31 [ first   plus   uint64] : auto speedups     1.1142(G)     1.1222(G)     1.1086(G)     1.1636(G) speedups     0.8777(G)     0.8597(d)     0.8638(G)     0.8959(d) 
 32 [ first   plus   single] : auto speedups     1.2241(G)     1.2603(G)     1.2371(G)     1.3209(G) speedups     0.9184(G)     1.0392(d)     0.8878(G)     1.0974(d) 
 33 [ first   plus   double] : auto speedups     1.1435(G)     1.1401(G)     1.1792(G)     1.2292(G) speedups     1.0088(G)     2.6816(d)     0.9917(G)     2.8132(d) 
 34 [ first  times  logical] : auto speedups     1.1350(G)     1.1493(G)     1.1402(G)     1.2193(G) speedups     0.9237(G)     1.0394(d)     0.9027(G)     1.0753(d) 
 35 [ first  times     int8] : auto speedups     0.9844(G)     1.0206(G)     1.0022(G)     1.0735(G) speedups     0.9204(G)     0.8945(d)     0.8791(G)     0.9359(d) 
 36 [ first  times    uint8] : auto speedups     0.9816(G)     1.0196(G)     0.9901(G)     1.0688(G) speedups     0.9181(G)     0.8958(d)     0.8956(G)     0.9340(d) 
 37 [ first  times    int16] : auto speedups     0.9868(G)     1.0137(G)     0.9946(G)     1.0748(G) speedups     0.9127(G)     0.8846(d)     0.8904(G)     0.9315(d) 
 38 [ first  times   uint16] : auto speedups     0.9778(G)     0.9848(G)     1.0012(G)     1.0794(G) speedups     0.9213(G)     0.8934(d)     0.8960(G)     0.9465(d) 
 39 [ first  times    int32] : auto speedups     1.2220(G)     1.2381(G)     1.2324(G)     1.3111(G) speedups     0.9179(G)     0.9008(d)     0.9148(G)     0.9352(d) 
 40 [ first  times   uint32] : auto speedups     1.2331(G)     1.2450(G)     1.2469(G)     1.2986(G) speedups     0.9093(G)     0.9150(d)     0.9154(G)     0.9458(d) 
 41 [ first  times    int64] : auto speedups     1.0967(G)     1.1226(G)     1.0949(G)     1.1933(G) speedups     0.8858(G)     0.9034(d)     0.8896(G)     0.9473(d) 
 42 [ first  times   uint64] : auto speedups     1.0794(G)     1.1101(G)     1.0909(G)     1.1829(G) speedups     0.9061(G)     0.8594(d)     0.8961(G)     0.9141(d) 
 43 [ first  times   single] : auto speedups     0.7350(G)     0.7381(G)     0.7435(G)     0.7994(G) speedups     0.8602(G)     0.9860(d)     0.8508(G)     1.0161(d) 
 44 [ first  times   double] : auto speedups     0.7677(G)     0.7830(G)     0.7783(G)     0.8435(G) speedups     0.9036(G)     2.0983(d)     0.8512(G)     2.1889(d) 
 45 [ first     or  logical] : auto speedups     1.2381(G)     1.2693(G)     1.2187(G)     1.3331(G) speedups     0.8678(G)     1.0435(d)     0.8430(G)     1.0940(d) 
 46 [ first    and  logical] : auto speedups     1.1195(G)     1.1632(G)     1.1278(G)     1.2267(G) speedups     0.9315(G)     1.0339(d)     0.8919(G)     1.0805(d) 
 47 [ first    xor  logical] : auto speedups     1.2112(G)     1.2581(G)     1.2343(G)     1.3151(G) speedups     0.8964(G)     1.0271(d)     0.8856(G)     1.0855(d) 
 48 [ first     eq  logical] : auto speedups     1.1131(G)     1.1526(G)     1.1315(G)     1.2287(G) speedups     0.9129(G)     1.0374(d)     0.8863(G)     1.0996(d) 
 49 [second    min  logical] : auto speedups     1.1990(G)     1.2219(G)     1.2141(G)     1.2824(G) speedups     0.9937(G)     1.0184(d)     0.9852(G)     1.0668(d) 
 50 [second    min     int8] : auto speedups     1.2499(G)     1.2712(G)     1.2614(G)     1.3135(G) speedups     1.0230(G)     0.8994(d)     0.9995(G)     0.9303(d) 
 51 [second    min    uint8] : auto speedups     1.2749(G)     1.3138(G)     1.2782(G)     1.3965(G) speedups     0.9227(G)     0.8612(d)     0.9031(G)     0.9223(d) 
 52 [second    min    int16] : auto speedups     1.2095(G)     1.2323(G)     1.1737(G)     1.2385(G) speedups     0.8467(G)     0.8425(d)     0.8383(G)     0.8764(d) 
 53 [second    min   uint16] : auto speedups     1.0831(G)     1.1043(G)     1.1132(G)     1.2393(G) speedups     0.8770(G)     0.8798(d)     0.8589(G)     0.9337(d) 
 54 [second    min    int32] : auto speedups     1.2314(G)     1.2496(G)     1.2423(G)     1.3055(G) speedups     0.9027(G)     0.8987(d)     0.8982(G)     0.9316(d) 
 55 [second    min   uint32] : auto speedups     1.2327(G)     1.2379(G)     1.2402(G)     1.3006(G) speedups     0.9252(G)     0.8997(d)     0.8868(G)     0.9334(d) 
 56 [second    min    int64] : auto speedups     1.1759(G)     1.1921(G)     1.1826(G)     1.2436(G) speedups     0.8822(G)     0.9023(d)     0.8472(G)     0.9383(d) 
 57 [second    min   uint64] : auto speedups     1.1404(G)     1.1988(G)     1.1626(G)     1.2620(G) speedups     0.8801(G)     0.8466(d)     0.8595(G)     0.8634(d) 
 58 [second    min   single] : auto speedups     1.0954(G)     1.1139(G)     1.0952(G)     1.1768(G) speedups     0.9363(G)     1.0278(d)     0.9059(G)     1.0452(d) 
 59 [second    min   double] : auto speedups     1.0855(G)     1.1138(G)     1.0893(G)     1.1671(G) speedups     0.9268(G)     2.6871(d)     0.9082(G)     2.8072(d) 
 60 [second    max  logical] : auto speedups     1.3137(G)     1.3559(G)     1.2945(G)     1.3850(G) speedups     0.8993(G)     1.0442(d)     0.8763(G)     1.0816(d) 
 61 [second    max     int8] : auto speedups     1.2391(G)     1.2530(G)     1.2459(G)     1.3096(G) speedups     1.0851(G)     0.8895(d)     1.0500(G)     0.9267(d) 
 62 [second    max    uint8] : auto speedups     1.1616(G)     1.1894(G)     1.1834(G)     1.2563(G) speedups     0.9849(G)     0.8628(d)     0.9630(G)     0.9129(d) 
 63 [second    max    int16] : auto speedups     1.2239(G)     1.2368(G)     1.2195(G)     1.3149(G) speedups     0.8806(G)     0.8697(d)     0.8531(G)     0.9242(d) 
 64 [second    max   uint16] : auto speedups     1.1618(G)     1.1903(G)     1.1781(G)     1.2138(G) speedups     0.8509(G)     0.8221(d)     0.8248(G)     0.8696(d) 
 65 [second    max    int32] : auto speedups     1.1910(G)     1.1953(G)     1.1204(G)     1.3052(G) speedups     0.9025(G)     0.8547(d)     0.8693(G)     0.9059(d) 
 66 [second    max   uint32] : auto speedups     1.1473(G)     1.1419(G)     1.1602(G)     1.2363(G) speedups     0.8987(G)     0.8239(d)     0.8767(G)     0.8607(d) 
 67 [second    max    int64] : auto speedups     1.1641(G)     1.1601(G)     1.1523(G)     1.2314(G) speedups     0.8476(G)     0.8598(d)     0.8311(G)     0.9067(d) 
 68 [second    max   uint64] : auto speedups     1.0675(G)     1.0995(G)     1.0953(G)     1.1337(G) speedups     0.8819(G)     0.8152(d)     0.8656(G)     0.8555(d) 
 69 [second    max   single] : auto speedups     1.0552(G)     1.0985(G)     1.0561(G)     1.1111(G) speedups     0.9271(G)     1.0333(d)     0.9243(G)     1.0753(d) 
 70 [second    max   double] : auto speedups     1.0956(G)     1.0782(G)     1.0773(G)     1.1569(G) speedups     0.9403(G)     2.6636(d)     0.9237(G)     2.7242(d) 
 71 [second   plus  logical] : auto speedups     1.3246(G)     1.3317(G)     1.3160(G)     1.3992(G) speedups     0.9097(G)     1.0273(d)     0.8801(G)     1.0705(d) 
 72 [second   plus     int8] : auto speedups     1.3048(G)     1.3148(G)     1.3260(G)     1.4122(G) speedups     0.8679(G)     0.8943(d)     0.8542(G)     0.9537(d) 
 73 [second   plus    uint8] : auto speedups     1.3074(G)     1.3463(G)     1.2816(G)     1.4319(G) speedups     0.9119(G)     0.8831(d)     0.8985(G)     0.9554(d) 
 74 [second   plus    int16] : auto speedups     1.2485(G)     1.3393(G)     1.2704(G)     1.4206(G) speedups     0.8941(G)     0.8781(d)     0.8785(G)     0.9321(d) 
 75 [second   plus   uint16] : auto speedups     1.3037(G)     1.3354(G)     1.2856(G)     1.4112(G) speedups     0.8803(G)     0.8938(d)     0.8589(G)     0.9367(d) 
 76 [second   plus    int32] : auto speedups     1.2874(G)     1.2604(G)     1.3274(G)     1.3988(G) speedups     0.8819(G)     0.9063(d)     0.8539(G)     0.9497(d) 
 77 [second   plus   uint32] : auto speedups     1.2924(G)     1.3276(G)     1.3241(G)     1.4094(G) speedups     0.8993(G)     0.8855(d)     0.8827(G)     0.9566(d) 
 78 [second   plus    int64] : auto speedups     1.2391(G)     1.2532(G)     1.1994(G)     1.3399(G) speedups     0.8861(G)     0.9063(d)     0.8558(G)     0.9578(d) 
 79 [second   plus   uint64] : auto speedups     1.2552(G)     1.2763(G)     1.2601(G)     1.3244(G) speedups     0.9253(G)     0.8616(d)     0.8956(G)     0.8991(d) 
 80 [second   plus   single] : auto speedups     1.3087(G)     1.3413(G)     1.3358(G)     1.4201(G) speedups     0.9365(G)     1.0445(d)     0.9233(G)     1.0898(d) 
 81 [second   plus   double] : auto speedups     1.3142(G)     1.3347(G)     1.3104(G)     1.3966(G) speedups     0.9451(G)     2.9188(d)     0.9161(G)     2.9206(d) 
 82 [second  times  logical] : auto speedups     1.2055(G)     1.2333(G)     1.2222(G)     1.3023(G) speedups     1.0073(G)     1.0212(d)     0.9880(G)     1.0581(d) 
 83 [second  times     int8] : auto speedups     1.0012(G)     1.0269(G)     1.0096(G)     1.0895(G) speedups     0.8759(G)     0.8473(d)     0.8544(G)     0.9422(d) 
 84 [second  times    uint8] : auto speedups     1.0105(G)     1.0249(G)     1.0110(G)     1.0809(G) speedups     0.8752(G)     0.8939(d)     0.8604(G)     0.9442(d) 
 85 [second  times    int16] : auto speedups     0.9968(G)     1.0258(G)     1.0162(G)     1.0750(G) speedups     0.8804(G)     0.8849(d)     0.8771(G)     0.9156(d) 
 86 [second  times   uint16] : auto speedups     0.9953(G)     1.0139(G)     1.0071(G)     1.0846(G) speedups     0.8842(G)     0.9021(d)     0.8690(G)     0.9363(d) 
 87 [second  times    int32] : auto speedups     1.2650(G)     1.3524(G)     1.3003(G)     1.3935(G) speedups     0.9395(G)     0.9100(d)     0.9149(G)     0.9383(d) 
 88 [second  times   uint32] : auto speedups     1.2912(G)     1.2930(G)     1.3182(G)     1.3690(G) speedups     0.9219(G)     0.9079(d)     0.9199(G)     0.9507(d) 
 89 [second  times    int64] : auto speedups     1.2180(G)     1.2671(G)     1.2685(G)     1.3232(G) speedups     0.8995(G)     0.9052(d)     0.8840(G)     0.9528(d) 
 90 [second  times   uint64] : auto speedups     1.2452(G)     1.2630(G)     1.2540(G)     1.3167(G) speedups     0.8959(G)     0.8621(d)     0.8786(G)     0.9073(d) 
 91 [second  times   single] : auto speedups     0.7535(G)     0.7639(G)     0.7620(G)     0.8175(G) speedups     0.8435(G)     0.9798(d)     0.8249(G)     1.0092(d) 
 92 [second  times   double] : auto speedups     0.8498(G)     0.8588(G)     0.8604(G)     0.9144(G) speedups     0.8098(G)     2.1625(d)     0.7926(G)     2.2181(d) 
 93 [second     or  logical] : auto speedups     1.2976(G)     1.3311(G)     1.3440(G)     1.4030(G) speedups     0.8937(G)     1.0429(d)     0.8859(G)     1.0934(d) 
 94 [second    and  logical] : auto speedups     1.2083(G)     1.2283(G)     1.2145(G)     1.3007(G) speedups     1.0020(G)     1.0229(d)     0.9789(G)     1.0619(d) 
 95 [second    xor  logical] : auto speedups     1.2736(G)     1.2967(G)     1.2842(G)     1.3947(G) speedups     0.8641(G)     1.0462(d)     0.8633(G)     1.1037(d) 
 96 [second     eq  logical] : auto speedups     1.2805(G)     1.3462(G)     1.3251(G)     1.4130(G) speedups     0.8747(G)     1.0145(d)     0.8597(G)     1.0873(d) 
 97 [   min    min  logical] : auto speedups     1.0619(G)     1.0779(G)     1.0517(G)     1.1556(G) speedups     1.0244(G)     1.0202(d)     1.0034(G)     1.0435(d) 
 98 [   min    min     int8] : auto speedups     0.9918(G)     1.0080(G)     0.9912(G)     1.0470(G) speedups     0.9218(G)     0.8321(d)     0.8817(G)     0.8799(d) 
 99 [   min    min    uint8] : auto speedups     1.0933(G)     1.0605(G)     1.0841(G)     1.1653(G) speedups     0.8783(G)     0.8686(d)     0.8604(G)     0.8964(d) 
100 [   min    min    int16] : auto speedups     1.0325(G)     1.0427(G)     1.0226(G)     1.1182(G) speedups     0.9388(G)     0.8667(d)     0.9277(G)     0.9188(d) 
101 [   min    min   uint16] : auto speedups     1.0334(G)     1.0668(G)     1.0256(G)     1.1160(G) speedups     0.9550(G)     0.8803(d)     0.9109(G)     0.9254(d) 
102 [   min    min    int32] : auto speedups     1.0641(G)     1.0874(G)     1.0725(G)     1.1468(G) speedups     0.9843(G)     0.8693(d)     0.9507(G)     0.9106(d) 
103 [   min    min   uint32] : auto speedups     1.0481(G)     1.0876(G)     1.0632(G)     1.1368(G) speedups     0.8466(G)     0.8797(d)     0.8473(G)     0.9294(d) 
104 [   min    min    int64] : auto speedups     0.9634(G)     1.0014(G)     0.9673(G)     1.0435(G) speedups     0.9617(G)     0.8705(d)     0.9293(G)     0.9210(d) 
105 [   min    min   uint64] : auto speedups     0.9491(G)     0.9792(G)     0.9791(G)     1.0521(G) speedups     0.8791(G)     0.8065(d)     0.8605(G)     0.8701(d) 
106 [   min    min   single] : auto speedups     0.7968(G)     0.8203(G)     0.8056(G)     0.8553(G) speedups     0.9651(G)     0.9966(d)     0.9499(G)     1.0555(d) 
107 [   min    min   double] : auto speedups     0.7723(G)     0.7701(G)     0.7784(G)     0.8350(G) speedups     0.9242(G)     2.2496(d)     0.8959(G)     2.3282(d) 
108 [   min    max  logical] : auto speedups     1.0530(G)     1.0760(G)     1.0519(G)     1.1427(G) speedups     1.0717(G)     1.0378(d)     1.0475(G)     1.0612(d) 
109 [   min    max     int8] : auto speedups     1.0344(G)     1.0237(G)     1.0496(G)     1.1054(G) speedups     0.9255(G)     0.8271(d)     0.8988(G)     0.8832(d) 
110 [   min    max    uint8] : auto speedups     1.0759(G)     1.0988(G)     1.0683(G)     1.1671(G) speedups     0.8774(G)     0.8557(d)     0.8528(G)     0.8970(d) 
111 [   min    max    int16] : auto speedups     1.0129(G)     1.0268(G)     1.0199(G)     1.1081(G) speedups     0.9243(G)     0.8762(d)     0.8996(G)     0.9047(d) 
112 [   min    max   uint16] : auto speedups     0.9736(G)     1.0082(G)     0.9807(G)     1.0738(G) speedups     0.9474(G)     0.8681(d)     0.9241(G)     0.9131(d) 
113 [   min    max    int32] : auto speedups     1.0590(G)     1.0742(G)     1.0792(G)     1.1314(G) speedups     0.9842(G)     0.8761(d)     0.9746(G)     0.9148(d) 
114 [   min    max   uint32] : auto speedups     1.0252(G)     1.0379(G)     1.0260(G)     1.1039(G) speedups     0.9657(G)     0.8736(d)     0.9239(G)     0.9066(d) 
115 [   min    max    int64] : auto speedups     0.9642(G)     0.9771(G)     0.9927(G)     1.0414(G) speedups     0.9764(G)     0.8755(d)     0.9476(G)     0.9141(d) 
116 [   min    max   uint64] : auto speedups     0.9435(G)     0.9514(G)     0.9503(G)     0.9987(G) speedups     0.9456(G)     0.8217(d)     0.9008(G)     0.8532(d) 
117 [   min    max   single] : auto speedups     0.8036(G)     0.8104(G)     0.8092(G)     0.8668(G) speedups     0.9530(G)     0.9933(d)     0.9370(G)     1.0634(d) 
118 [   min    max   double] : auto speedups     0.7714(G)     0.7919(G)     0.7743(G)     0.8299(G) speedups     0.8432(G)     2.2328(d)     0.8317(G)     2.3028(d) 
119 [   min   plus  logical] : auto speedups     1.0341(G)     1.0762(G)     1.0644(G)     1.1155(G) speedups     1.0807(G)     1.0319(d)     1.0467(G)     1.0956(d) 
120 [   min   plus     int8] : auto speedups     1.1593(G)     1.1873(G)     1.1566(G)     1.2536(G) speedups     0.9392(G)     0.8675(d)     0.9120(G)     0.8967(d) 
121 [   min   plus    uint8] : auto speedups     1.1454(G)     1.1855(G)     1.1795(G)     1.2639(G) speedups     0.8802(G)     0.8656(d)     0.8673(G)     0.9158(d) 
122 [   min   plus    int16] : auto speedups     1.1326(G)     1.1620(G)     1.1450(G)     1.2256(G) speedups     1.0158(G)     0.8827(d)     0.9831(G)     0.9394(d) 
123 [   min   plus   uint16] : auto speedups     1.1228(G)     1.1516(G)     1.1515(G)     1.2264(G) speedups     0.9454(G)     0.8922(d)     0.9351(G)     0.9479(d) 
124 [   min   plus    int32] : auto speedups     1.1402(G)     1.1708(G)     1.1583(G)     1.2335(G) speedups     0.8716(G)     0.8988(d)     0.8535(G)     0.9386(d) 
125 [   min   plus   uint32] : auto speedups     1.1310(G)     1.1590(G)     1.1450(G)     1.2330(G) speedups     0.9318(G)     0.8919(d)     0.9195(G)     0.9411(d) 
126 [   min   plus    int64] : auto speedups     1.0401(G)     1.0341(G)     1.0504(G)     1.0934(G) speedups     0.8658(G)     0.8990(d)     0.8593(G)     0.9350(d) 
127 [   min   plus   uint64] : auto speedups     1.0467(G)     1.0464(G)     1.0287(G)     1.1089(G) speedups     0.8745(G)     0.8305(d)     0.8449(G)     0.8700(d) 
128 [   min   plus   single] : auto speedups     1.0399(G)     1.0275(G)     1.0488(G)     1.1343(G) speedups     0.9840(G)     0.9927(d)     0.9755(G)     1.0561(d) 
129 [   min   plus   double] : auto speedups     1.0048(G)     1.0131(G)     1.0115(G)     1.0756(G) speedups     0.9219(G)     2.3021(d)     0.9327(G)     2.3699(d) 
130 [   min  times  logical] : auto speedups     1.0602(G)     1.0642(G)     1.0536(G)     1.1562(G) speedups     1.0196(G)     1.0251(d)     0.9884(G)     1.0712(d) 
131 [   min  times     int8] : auto speedups     1.1314(G)     1.1432(G)     1.1390(G)     1.2134(G) speedups     1.0123(G)     0.8521(d)     0.9764(G)     0.9047(d) 
132 [   min  times    uint8] : auto speedups     1.1829(G)     1.1917(G)     1.1855(G)     1.2416(G) speedups     0.9170(G)     0.8668(d)     0.8917(G)     0.9126(d) 
133 [   min  times    int16] : auto speedups     1.1106(G)     1.1692(G)     1.1513(G)     1.2295(G) speedups     0.9489(G)     0.8828(d)     0.9327(G)     0.9189(d) 
134 [   min  times   uint16] : auto speedups     1.1222(G)     1.1592(G)     1.1283(G)     1.2329(G) speedups     0.9509(G)     0.8824(d)     0.9364(G)     0.9385(d) 
135 [   min  times    int32] : auto speedups     1.1518(G)     1.1778(G)     1.1637(G)     1.2275(G) speedups     0.9897(G)     0.8873(d)     0.9577(G)     0.9355(d) 
136 [   min  times   uint32] : auto speedups     1.1488(G)     1.1878(G)     1.1500(G)     1.1874(G) speedups     1.0017(G)     0.8920(d)     0.9792(G)     0.9409(d) 
137 [   min  times    int64] : auto speedups     1.0305(G)     1.0646(G)     1.0547(G)     1.1163(G) speedups     0.8520(G)     0.8227(d)     0.8272(G)     0.9252(d) 
138 [   min  times   uint64] : auto speedups     1.0071(G)     1.0662(G)     1.0329(G)     1.1273(G) speedups     0.8956(G)     0.8229(d)     0.8854(G)     0.8546(d) 
139 [   min  times   single] : auto speedups     0.6450(G)     0.6587(G)     0.6492(G)     0.6986(G) speedups     0.9476(G)     0.9359(d)     0.9406(G)     0.9707(d) 
140 [   min  times   double] : auto speedups     0.6902(G)     0.6956(G)     0.6992(G)     0.7390(G) speedups     0.7962(G)     1.7334(d)     0.7722(G)     1.8635(d) 
141 [   min     or  logical] : auto speedups     1.0560(G)     1.0821(G)     1.0706(G)     1.1252(G) speedups     1.0625(G)     1.0261(d)     1.0470(G)     1.0941(d) 
142 [   min    and  logical] : auto speedups     1.0600(G)     1.0825(G)     1.0659(G)     1.1278(G) speedups     0.9813(G)     0.9558(d)     0.9552(G)     0.9692(d) 
143 [   min    xor  logical] : auto speedups     1.0861(G)     1.1164(G)     1.1350(G)     1.2121(G) speedups     1.0267(G)     1.0132(d)     1.0032(G)     1.0877(d) 
144 [   min     eq  logical] : auto speedups     1.1467(G)     1.1790(G)     1.1523(G)     1.2391(G) speedups     1.1234(G)     1.0128(d)     1.0749(G)     1.0702(d) 
145 [   max    min  logical] : auto speedups     1.0940(G)     1.1062(G)     1.0829(G)     1.1479(G) speedups     0.8630(G)     0.9930(d)     0.8533(G)     1.0466(d) 
146 [   max    min     int8] : auto speedups     0.9802(G)     1.0243(G)     1.0162(G)     1.0819(G) speedups     0.8845(G)     0.8442(d)     0.8646(G)     0.8798(d) 
147 [   max    min    uint8] : auto speedups     1.0620(G)     1.0955(G)     1.0778(G)     1.1272(G) speedups     0.9437(G)     0.8400(d)     0.9214(G)     0.8916(d) 
148 [   max    min    int16] : auto speedups     1.0318(G)     1.0526(G)     1.0294(G)     1.1067(G) speedups     0.9816(G)     0.8764(d)     0.9604(G)     0.8979(d) 
149 [   max    min   uint16] : auto speedups     0.9752(G)     1.0200(G)     0.9773(G)     1.0702(G) speedups     0.9744(G)     0.8620(d)     0.9451(G)     0.8691(d) 
150 [   max    min    int32] : auto speedups     1.0275(G)     1.0650(G)     1.0597(G)     1.1449(G) speedups     0.9856(G)     0.8678(d)     0.9728(G)     0.9227(d) 
151 [   max    min   uint32] : auto speedups     1.0300(G)     1.0483(G)     1.0274(G)     1.0774(G) speedups     0.9876(G)     0.8628(d)     0.9740(G)     0.8812(d) 
152 [   max    min    int64] : auto speedups     0.9707(G)     1.0030(G)     0.9570(G)     1.0379(G) speedups     0.9699(G)     0.8795(d)     0.9338(G)     0.9143(d) 
153 [   max    min   uint64] : auto speedups     0.9410(G)     0.9492(G)     0.9271(G)     1.0165(G) speedups     0.9499(G)     0.7997(d)     0.9184(G)     0.8603(d) 
154 [   max    min   single] : auto speedups     0.7940(G)     0.8113(G)     0.8053(G)     0.8613(G) speedups     0.9401(G)     0.9918(d)     0.9441(G)     1.0660(d) 
155 [   max    min   double] : auto speedups     0.7747(G)     0.7852(G)     0.7826(G)     0.8352(G) speedups     0.9352(G)     2.2530(d)     0.9158(G)     2.3685(d) 
156 [   max    max  logical] : auto speedups     1.1744(G)     1.2165(G)     1.2022(G)     1.3092(G) speedups     0.9046(G)     1.0223(d)     0.8971(G)     1.0852(d) 
157 [   max    max     int8] : auto speedups     0.9792(G)     1.0087(G)     0.9867(G)     1.0729(G) speedups     0.9074(G)     0.8259(d)     0.8883(G)     0.8702(d) 
158 [   max    max    uint8] : auto speedups     0.9361(G)     0.9615(G)     0.9384(G)     0.9934(G) speedups     0.9364(G)     0.8272(d)     0.9207(G)     0.8656(d) 
159 [   max    max    int16] : auto speedups     1.0364(G)     1.0497(G)     1.0270(G)     1.1010(G) speedups     0.9389(G)     0.8666(d)     0.9290(G)     0.9201(d) 
160 [   max    max   uint16] : auto speedups     0.9376(G)     0.9446(G)     0.9495(G)     1.0196(G) speedups     0.9031(G)     0.8543(d)     0.8986(G)     0.9214(d) 
161 [   max    max    int32] : auto speedups     1.0675(G)     1.0567(G)     1.0828(G)     1.1436(G) speedups     0.9665(G)     0.8819(d)     0.9419(G)     0.9147(d) 
162 [   max    max   uint32] : auto speedups     0.9738(G)     0.9879(G)     0.9952(G)     1.0708(G) speedups     0.9603(G)     0.8689(d)     0.9410(G)     0.9142(d) 
163 [   max    max    int64] : auto speedups     0.9709(G)     0.9963(G)     0.9844(G)     1.0647(G) speedups     0.9757(G)     0.8697(d)     0.9299(G)     0.9180(d) 
164 [   max    max   uint64] : auto speedups     0.9076(G)     0.9270(G)     0.9105(G)     0.9535(G) speedups     0.8581(G)     0.8114(d)     0.8433(G)     0.8417(d) 
165 [   max    max   single] : auto speedups     0.7947(G)     0.8157(G)     0.8003(G)     0.8672(G) speedups     0.9475(G)     1.0126(d)     0.9419(G)     0.9892(d) 
166 [   max    max   double] : auto speedups     0.7687(G)     0.7895(G)     0.7835(G)     0.8271(G) speedups     0.9145(G)     2.2407(d)     0.9142(G)     2.2910(d) 
167 [   max   plus  logical] : auto speedups     1.1947(G)     1.2089(G)     1.2189(G)     1.2988(G) speedups     0.9117(G)     1.0393(d)     0.8904(G)     1.0678(d) 
168 [   max   plus     int8] : auto speedups     1.1486(G)     1.1763(G)     1.1467(G)     1.2352(G) speedups     0.9290(G)     0.8534(d)     0.9057(G)     0.8771(d) 
169 [   max   plus    uint8] : auto speedups     1.1246(G)     1.1880(G)     1.1568(G)     1.2438(G) speedups     0.8709(G)     0.8682(d)     0.8677(G)     0.8987(d) 
170 [   max   plus    int16] : auto speedups     1.1242(G)     1.1604(G)     1.1271(G)     1.2203(G) speedups     1.0135(G)     0.8811(d)     0.9956(G)     0.9199(d) 
171 [   max   plus   uint16] : auto speedups     1.0988(G)     1.1237(G)     1.1022(G)     1.1814(G) speedups     0.9513(G)     0.8931(d)     0.9227(G)     0.9369(d) 
172 [   max   plus    int32] : auto speedups     1.1416(G)     1.1601(G)     1.1598(G)     1.2451(G) speedups     0.8823(G)     0.8807(d)     0.8616(G)     0.9303(d) 
173 [   max   plus   uint32] : auto speedups     1.1220(G)     1.1052(G)     1.1338(G)     1.1846(G) speedups     0.9313(G)     0.8851(d)     0.9138(G)     0.9215(d) 
174 [   max   plus    int64] : auto speedups     1.0348(G)     1.0554(G)     1.0483(G)     1.1092(G) speedups     0.8735(G)     0.8957(d)     0.8581(G)     0.9223(d) 
175 [   max   plus   uint64] : auto speedups     0.9817(G)     1.0162(G)     1.0010(G)     1.0547(G) speedups     0.8729(G)     0.8002(d)     0.8469(G)     0.8370(d) 
176 [   max   plus   single] : auto speedups     1.0477(G)     1.0514(G)     1.0300(G)     1.1197(G) speedups     0.9837(G)     1.0078(d)     0.9797(G)     1.0523(d) 
177 [   max   plus   double] : auto speedups     1.0121(G)     1.0008(G)     1.0149(G)     1.0724(G) speedups     0.9471(G)     2.3349(d)     0.9167(G)     2.4265(d) 
178 [   max  times  logical] : auto speedups     1.0725(G)     1.0757(G)     1.0601(G)     1.1291(G) speedups     0.8553(G)     0.9611(d)     0.8320(G)     1.0182(d) 
179 [   max  times     int8] : auto speedups     1.1208(G)     1.1382(G)     1.1435(G)     1.1263(G) speedups     0.9215(G)     0.8107(d)     0.9166(G)     0.8016(d) 
180 [   max  times    uint8] : auto speedups     1.1394(G)     1.1220(G)     1.1447(G)     1.2301(G) speedups     0.9369(G)     0.8264(d)     0.9473(G)     0.9262(d) 
181 [   max  times    int16] : auto speedups     1.1306(G)     1.1454(G)     1.1190(G)     1.2250(G) speedups     0.8757(G)     0.8813(d)     0.8663(G)     0.9300(d) 
182 [   max  times   uint16] : auto speedups     1.1022(G)     1.1272(G)     1.1266(G)     1.1857(G) speedups     0.8862(G)     0.8920(d)     0.8599(G)     0.9178(d) 
183 [   max  times    int32] : auto speedups     1.1491(G)     1.1609(G)     1.1615(G)     1.2274(G) speedups     0.9080(G)     0.8980(d)     0.8985(G)     0.9313(d) 
184 [   max  times   uint32] : auto speedups     1.0951(G)     1.0990(G)     1.0927(G)     1.1755(G) speedups     1.0002(G)     0.8661(d)     0.9811(G)     0.9240(d) 
185 [   max  times    int64] : auto speedups     1.0388(G)     1.0439(G)     1.0545(G)     1.1448(G) speedups     0.9809(G)     0.8791(d)     0.9545(G)     0.9305(d) 
186 [   max  times   uint64] : auto speedups     1.0001(G)     1.0288(G)     1.0194(G)     1.0927(G) speedups     0.9742(G)     0.8077(d)     0.9541(G)     0.8465(d) 
187 [   max  times   single] : auto speedups     0.6504(G)     0.6596(G)     0.6561(G)     0.7059(G) speedups     0.8591(G)     0.9313(d)     0.8417(G)     0.9956(d) 
188 [   max  times   double] : auto speedups     0.6912(G)     0.7128(G)     0.6992(G)     0.7538(G) speedups     0.8149(G)     1.8175(d)     0.8007(G)     1.8570(d) 
189 [   max     or  logical] : auto speedups     1.1628(G)     1.2187(G)     1.1727(G)     1.2897(G) speedups     0.9060(G)     1.0287(d)     0.8948(G)     1.0842(d) 
190 [   max    and  logical] : auto speedups     1.0660(G)     1.1024(G)     1.0570(G)     1.1595(G) speedups     0.8163(G)     0.9753(d)     0.8520(G)     1.0431(d) 
191 [   max    xor  logical] : auto speedups     1.1638(G)     1.1895(G)     1.1774(G)     1.2874(G) speedups     0.8979(G)     1.0442(d)     0.8915(G)     1.0841(d) 
192 [   max     eq  logical] : auto speedups     0.9944(G)     1.0217(G)     0.9852(G)     1.0846(G) speedups     1.0190(G)     1.0209(d)     0.9715(G)     1.0746(d) 
193 [  plus    min  logical] : auto speedups     1.0212(G)     1.0743(G)     1.0831(G)     1.1634(G) speedups     0.8689(G)     1.0079(d)     0.8498(G)     1.0609(d) 
194 [  plus    min     int8] : auto speedups     1.1192(G)     1.1477(G)     1.1069(G)     1.2099(G) speedups     1.0354(G)     0.8698(d)     0.9937(G)     0.8893(d) 
195 [  plus    min    uint8] : auto speedups     1.0489(G)     1.0899(G)     1.0568(G)     1.1192(G) speedups     0.8470(G)     0.7952(d)     0.7911(G)     0.8358(d) 
196 [  plus    min    int16] : auto speedups     0.7606(G)     0.7940(G)     0.7935(G)     0.8355(G) speedups     0.8335(G)     0.8529(d)     0.8250(G)     0.8855(d) 
197 [  plus    min   uint16] : auto speedups     0.7912(G)     0.8075(G)     0.7739(G)     0.8442(G) speedups     0.9360(G)     0.8323(d)     0.9010(G)     0.8939(d) 
198 [  plus    min    int32] : auto speedups     1.1063(G)     1.1257(G)     1.1372(G)     1.2153(G) speedups     0.9661(G)     0.8714(d)     0.9799(G)     0.9325(d) 
199 [  plus    min   uint32] : auto speedups     1.1123(G)     1.1612(G)     1.0980(G)     1.2036(G) speedups     0.9647(G)     0.8715(d)     0.9561(G)     0.9186(d) 
200 [  plus    min    int64] : auto speedups     1.0084(G)     1.0521(G)     1.0426(G)     1.0934(G) speedups     0.8780(G)     0.8644(d)     0.8514(G)     0.9184(d) 
201 [  plus    min   uint64] : auto speedups     1.0247(G)     1.0216(G)     1.0079(G)     1.1034(G) speedups     0.9576(G)     0.8077(d)     0.9583(G)     0.8807(d) 
202 [  plus    min   single] : auto speedups     1.0159(G)     1.0478(G)     0.9987(G)     1.1045(G) speedups     0.9922(G)     0.9975(d)     0.9482(G)     1.0768(d) 
203 [  plus    min   double] : auto speedups     0.9520(G)     0.9658(G)     0.9712(G)     1.0003(G) speedups     0.9567(G)     2.7585(d)     0.9455(G)     2.7764(d) 
204 [  plus    max  logical] : auto speedups     1.2017(G)     1.2326(G)     1.2134(G)     1.2882(G) speedups     0.9007(G)     1.0241(d)     0.8872(G)     1.0749(d) 
205 [  plus    max     int8] : auto speedups     1.0766(G)     1.1207(G)     1.1125(G)     1.2167(G) speedups     1.0079(G)     0.8733(d)     0.9601(G)     0.9194(d) 
206 [  plus    max    uint8] : auto speedups     1.0188(G)     1.0523(G)     1.0194(G)     1.1100(G) speedups     0.9609(G)     0.8513(d)     0.9523(G)     0.8857(d) 
207 [  plus    max    int16] : auto speedups     0.7977(G)     0.8238(G)     0.8077(G)     0.8745(G) speedups     0.9484(G)     0.8699(d)     0.9199(G)     0.9240(d) 
208 [  plus    max   uint16] : auto speedups     0.7420(G)     0.7591(G)     0.7476(G)     0.7967(G) speedups     0.8571(G)     0.8701(d)     0.8450(G)     0.9331(d) 
209 [  plus    max    int32] : auto speedups     1.1261(G)     1.1500(G)     1.1196(G)     1.1859(G) speedups     0.9638(G)     0.8831(d)     0.9534(G)     0.9193(d) 
210 [  plus    max   uint32] : auto speedups     1.0897(G)     1.1070(G)     1.0893(G)     1.1742(G) speedups     0.9422(G)     0.8734(d)     0.9483(G)     0.9285(d) 
211 [  plus    max    int64] : auto speedups     1.0407(G)     1.0378(G)     1.0298(G)     1.0958(G) speedups     0.9562(G)     0.8838(d)     0.9542(G)     0.9208(d) 
212 [  plus    max   uint64] : auto speedups     1.0084(G)     1.0038(G)     0.9917(G)     1.0440(G) speedups     0.8588(G)     0.8302(d)     0.8478(G)     0.8600(d) 
213 [  plus    max   single] : auto speedups     1.0121(G)     1.0326(G)     1.0384(G)     1.0882(G) speedups     0.9452(G)     1.0157(d)     0.9187(G)     1.0615(d) 
214 [  plus    max   double] : auto speedups     0.9600(G)     0.9634(G)     0.9868(G)     0.9992(G) speedups     0.8708(G)     2.7107(d)     0.8471(G)     2.7557(d) 
215 [  plus   plus  logical] : auto speedups     1.2144(G)     1.2016(G)     1.1875(G)     1.2955(G) speedups     0.9011(G)     1.0256(d)     0.8922(G)     1.0609(d) 
216 [  plus   plus     int8] : auto speedups     1.2004(G)     1.2051(G)     1.2078(G)     1.2649(G) speedups     0.9575(G)     0.8924(d)     0.9145(G)     0.9399(d) 
217 [  plus   plus    uint8] : auto speedups     1.1843(G)     1.1810(G)     1.2091(G)     1.1799(G) speedups     0.9084(G)     0.8947(d)     0.9034(G)     0.9436(d) 
218 [  plus   plus    int16] : auto speedups     1.2008(G)     1.1884(G)     1.2027(G)     1.2633(G) speedups     1.0003(G)     0.8961(d)     0.9705(G)     0.9234(d) 
219 [  plus   plus   uint16] : auto speedups     1.1749(G)     1.1852(G)     1.2053(G)     1.2665(G) speedups     1.0117(G)     0.8812(d)     0.9768(G)     0.9472(d) 
220 [  plus   plus    int32] : auto speedups     1.1907(G)     1.1863(G)     1.2360(G)     1.2647(G) speedups     0.9230(G)     0.8899(d)     0.9235(G)     0.9365(d) 
221 [  plus   plus   uint32] : auto speedups     1.1806(G)     1.1990(G)     1.2099(G)     1.2859(G) speedups     0.9033(G)     0.8553(d)     0.9075(G)     0.9399(d) 
222 [  plus   plus    int64] : auto speedups     1.0934(G)     1.0759(G)     1.1018(G)     1.1760(G) speedups     0.8812(G)     0.8991(d)     0.8812(G)     0.9408(d) 
223 [  plus   plus   uint64] : auto speedups     1.0851(G)     1.0745(G)     1.0806(G)     1.1471(G) speedups     0.8793(G)     0.8446(d)     0.8779(G)     0.8815(d) 
224 [  plus   plus   single] : auto speedups     1.1840(G)     1.2530(G)     1.1956(G)     1.3300(G) speedups     0.9033(G)     1.0260(d)     0.8995(G)     1.0977(d) 
225 [  plus   plus   double] : auto speedups     1.1385(G)     1.1479(G)     1.1288(G)     1.2239(G) speedups     0.8869(G)     2.5726(d)     0.8715(G)     2.6456(d) 
226 [  plus  times  logical] : auto speedups     1.0779(G)     1.1126(G)     1.0391(G)     1.1643(G) speedups     0.8771(G)     0.9877(d)     0.8438(G)     1.0520(d) 
227 [  plus  times     int8] : auto speedups     0.9819(G)     1.0138(G)     0.9952(G)     1.0583(G) speedups     0.9708(G)     0.8922(d)     0.9530(G)     0.9450(d) 
228 [  plus  times    uint8] : auto speedups     0.9770(G)     1.0046(G)     0.9998(G)     1.0565(G) speedups     0.9682(G)     0.9007(d)     0.9317(G)     0.9197(d) 
229 [  plus  times    int16] : auto speedups     1.1965(G)     1.1999(G)     1.1566(G)     1.2573(G) speedups     0.9077(G)     0.8738(d)     0.9117(G)     0.9163(d) 
230 [  plus  times   uint16] : auto speedups     1.1737(G)     1.1514(G)     1.1691(G)     1.2917(G) speedups     0.9900(G)     0.8907(d)     0.9622(G)     0.9212(d) 
231 [  plus  times    int32] : auto speedups     1.2029(G)     1.2029(G)     1.1967(G)     1.2827(G) speedups     0.9194(G)     0.8914(d)     0.8945(G)     0.9297(d) 
232 [  plus  times   uint32] : auto speedups     1.2015(G)     1.2074(G)     1.2236(G)     1.2905(G) speedups     0.9085(G)     0.8909(d)     0.8923(G)     0.9353(d) 
233 [  plus  times    int64] : auto speedups     1.0933(G)     1.0967(G)     1.0727(G)     1.1473(G) speedups     1.0080(G)     0.9040(d)     0.9870(G)     0.9245(d) 
234 [  plus  times   uint64] : auto speedups     1.1028(G)     1.1073(G)     1.1019(G)     1.1782(G) speedups     1.0038(G)     0.8434(d)     0.9689(G)     0.8775(d) 
235 [  plus  times   single] : auto speedups     0.6994(G)     0.7139(G)     0.7003(G)     0.7611(G) speedups     0.8935(G)     0.9570(d)     0.8767(G)     1.0085(d) 
236 [  plus  times   double] : auto speedups     0.7866(G)     0.7978(G)     0.7902(G)     0.8487(G) speedups     0.8303(G)     1.8782(d)     0.8261(G)     2.0171(d) 
237 [  plus     or  logical] : auto speedups     1.2086(G)     1.2329(G)     1.2109(G)     1.3033(G) speedups     0.9074(G)     1.0204(d)     0.8933(G)     1.0746(d) 
238 [  plus    and  logical] : auto speedups     1.0660(G)     1.1122(G)     1.0956(G)     1.1561(G) speedups     0.8852(G)     1.0121(d)     0.8574(G)     1.0530(d) 
239 [  plus    xor  logical] : auto speedups     1.1877(G)     1.2076(G)     1.1890(G)     1.2432(G) speedups     0.9095(G)     1.0368(d)     0.8821(G)     1.0998(d) 
240 [  plus     eq  logical] : auto speedups     0.9982(G)     1.0055(G)     1.0139(G)     1.0490(G) speedups     1.0175(G)     1.0038(d)     0.9857(G)     1.0625(d) 
241 [ minus    min  logical] : auto speedups     1.0866(G)     1.0968(G)     1.0933(G)     1.1621(G) speedups     0.9924(G)     1.0357(d)     0.9712(G)     1.0979(d) 
242 [ minus    min     int8] : auto speedups     1.0919(G)     1.1175(G)     1.0553(G)     1.1728(G) speedups     1.0765(G)     0.8632(d)     1.0773(G)     0.9217(d) 
243 [ minus    min    uint8] : auto speedups     1.0963(G)     1.1396(G)     1.1088(G)     1.1870(G) speedups     0.8966(G)     0.8442(d)     0.8884(G)     0.8981(d) 
244 [ minus    min    int16] : auto speedups     0.7984(G)     0.8193(G)     0.8062(G)     0.8686(G) speedups     0.8891(G)     0.8928(d)     0.8695(G)     0.9179(d) 
245 [ minus    min   uint16] : auto speedups     0.7990(G)     0.8131(G)     0.8061(G)     0.8682(G) speedups     0.9166(G)     0.8951(d)     0.9061(G)     0.9406(d) 
246 [ minus    min    int32] : auto speedups     1.1180(G)     1.1520(G)     1.1340(G)     1.2039(G) speedups     0.8842(G)     0.8841(d)     0.8777(G)     0.9323(d) 
247 [ minus    min   uint32] : auto speedups     1.1054(G)     1.1409(G)     1.1464(G)     1.1978(G) speedups     0.9011(G)     0.8914(d)     0.8864(G)     0.9294(d) 
248 [ minus    min    int64] : auto speedups     1.0315(G)     1.0439(G)     1.0069(G)     1.1190(G) speedups     0.8652(G)     0.8974(d)     0.8425(G)     0.9171(d) 
249 [ minus    min   uint64] : auto speedups     1.0305(G)     1.0484(G)     1.0246(G)     1.0950(G) speedups     0.9900(G)     0.8089(d)     0.9759(G)     0.8646(d) 
250 [ minus    min   single] : auto speedups     0.9934(G)     1.0422(G)     1.0317(G)     1.1022(G) speedups     0.9733(G)     1.0230(d)     0.9390(G)     1.0625(d) 
251 [ minus    min   double] : auto speedups     0.9675(G)     0.9518(G)     0.9664(G)     1.0378(G) speedups     0.9031(G)     2.5342(d)     0.8749(G)     2.5913(d) 
252 [ minus    max  logical] : auto speedups     1.1905(G)     1.2314(G)     1.1838(G)     1.3056(G) speedups     0.9037(G)     1.0134(d)     0.8583(G)     1.0569(d) 
253 [ minus    max     int8] : auto speedups     1.0930(G)     1.1194(G)     1.0758(G)     1.1964(G) speedups     1.0692(G)     0.8746(d)     1.0538(G)     0.9160(d) 
254 [ minus    max    uint8] : auto speedups     1.0149(G)     1.0289(G)     1.0042(G)     1.1012(G) speedups     1.0158(G)     0.8530(d)     0.9982(G)     0.8994(d) 
255 [ minus    max    int16] : auto speedups     0.7923(G)     0.8185(G)     0.8176(G)     0.8695(G) speedups     0.9357(G)     0.8772(d)     0.9138(G)     0.9362(d) 
256 [ minus    max   uint16] : auto speedups     0.7304(G)     0.7504(G)     0.7432(G)     0.7981(G) speedups     0.9324(G)     0.8782(d)     0.9253(G)     0.9334(d) 
257 [ minus    max    int32] : auto speedups     1.1183(G)     1.1419(G)     1.1280(G)     1.2110(G) speedups     0.9869(G)     0.8863(d)     0.9883(G)     0.9219(d) 
258 [ minus    max   uint32] : auto speedups     1.0570(G)     1.1177(G)     1.0680(G)     1.1668(G) speedups     1.0973(G)     0.8783(d)     1.1071(G)     0.9118(d) 
259 [ minus    max    int64] : auto speedups     1.0134(G)     1.0484(G)     1.0268(G)     1.1106(G) speedups     0.9727(G)     0.8783(d)     0.9361(G)     0.9338(d) 
260 [ minus    max   uint64] : auto speedups     0.8525(G)     0.9710(G)     0.9284(G)     1.0488(G) speedups     0.8600(G)     0.8273(d)     0.8470(G)     0.8658(d) 
261 [ minus    max   single] : auto speedups     1.0118(G)     1.0509(G)     0.9915(G)     1.0896(G) speedups     0.9432(G)     1.0111(d)     0.9167(G)     1.0616(d) 
262 [ minus    max   double] : auto speedups     0.9607(G)     0.9619(G)     0.9826(G)     1.0063(G) speedups     0.9811(G)     2.4289(d)     0.9075(G)     2.3771(d) 
263 [ minus   plus  logical] : auto speedups     1.0602(G)     1.1694(G)     1.1435(G)     1.2360(G) speedups     0.8789(G)     1.0290(d)     0.8894(G)     1.0597(d) 
264 [ minus   plus     int8] : auto speedups     1.1929(G)     1.1758(G)     1.2157(G)     1.2604(G) speedups     0.8838(G)     0.8788(d)     0.8664(G)     0.9280(d) 
265 [ minus   plus    uint8] : auto speedups     1.2033(G)     1.2117(G)     1.2130(G)     1.2908(G) speedups     0.8852(G)     0.8652(d)     0.8746(G)     0.8437(d) 
266 [ minus   plus    int16] : auto speedups     1.0885(G)     1.1409(G)     1.1629(G)     1.2453(G) speedups     0.8935(G)     0.8988(d)     0.8946(G)     0.9577(d) 
267 [ minus   plus   uint16] : auto speedups     1.1823(G)     1.2048(G)     1.1896(G)     1.2772(G) speedups     0.9572(G)     0.8835(d)     0.9353(G)     0.9221(d) 
268 [ minus   plus    int32] : auto speedups     1.1955(G)     1.2356(G)     1.2276(G)     1.2698(G) speedups     0.8968(G)     0.8819(d)     0.8771(G)     0.9328(d) 
269 [ minus   plus   uint32] : auto speedups     1.2055(G)     1.2294(G)     1.1838(G)     1.2762(G) speedups     0.9427(G)     0.8869(d)     0.9339(G)     0.9277(d) 
270 [ minus   plus    int64] : auto speedups     1.0799(G)     1.1112(G)     1.0807(G)     1.1801(G) speedups     0.8930(G)     0.8803(d)     0.8637(G)     0.9271(d) 
271 [ minus   plus   uint64] : auto speedups     1.0607(G)     1.0945(G)     1.0811(G)     1.1640(G) speedups     0.8858(G)     0.8391(d)     0.8468(G)     0.7950(d) 
272 [ minus   plus   single] : auto speedups     1.1815(G)     1.1898(G)     1.1914(G)     1.2298(G) speedups     0.9172(G)     1.0395(d)     0.9197(G)     1.0892(d) 
273 [ minus   plus   double] : auto speedups     1.1517(G)     1.0525(G)     1.1536(G)     1.2207(G) speedups     1.0079(G)     2.4915(d)     0.9979(G)     2.7046(d) 
274 [ minus  times  logical] : auto speedups     1.0891(G)     1.1153(G)     1.0810(G)     1.1728(G) speedups     0.9913(G)     1.0270(d)     0.9507(G)     1.0930(d) 
275 [ minus  times     int8] : auto speedups     0.8989(G)     0.9170(G)     0.8966(G)     0.9699(G) speedups     0.8455(G)     0.8953(d)     0.8298(G)     0.9417(d) 
276 [ minus  times    uint8] : auto speedups     0.8953(G)     0.9099(G)     0.9068(G)     0.9545(G) speedups     0.9657(G)     0.8951(d)     0.9596(G)     0.9384(d) 
277 [ minus  times    int16] : auto speedups     0.8916(G)     0.9121(G)     0.8990(G)     0.9578(G) speedups     0.9829(G)     0.8655(d)     0.9593(G)     0.9092(d) 
278 [ minus  times   uint16] : auto speedups     0.8958(G)     0.9012(G)     0.8914(G)     0.9597(G) speedups     1.0008(G)     0.8780(d)     0.9966(G)     0.9240(d) 
279 [ minus  times    int32] : auto speedups     1.2102(G)     1.2332(G)     1.1877(G)     1.3086(G) speedups     0.9207(G)     0.8948(d)     0.9093(G)     0.9295(d) 
280 [ minus  times   uint32] : auto speedups     1.1934(G)     1.2355(G)     1.2039(G)     1.3107(G) speedups     0.9059(G)     0.8975(d)     0.8819(G)     0.9290(d) 
281 [ minus  times    int64] : auto speedups     1.0826(G)     1.1102(G)     1.0787(G)     1.1608(G) speedups     0.8876(G)     0.8915(d)     0.8671(G)     0.9467(d) 
282 [ minus  times   uint64] : auto speedups     1.0907(G)     1.0883(G)     1.0976(G)     1.1190(G) speedups     0.9958(G)     0.8324(d)     0.9650(G)     0.8866(d) 
283 [ minus  times   single] : auto speedups     0.6823(G)     0.7133(G)     0.7070(G)     0.7438(G) speedups     0.9385(G)     0.9547(d)     0.9390(G)     0.9998(d) 
284 [ minus  times   double] : auto speedups     0.7882(G)     0.7757(G)     0.8028(G)     0.8445(G) speedups     0.8480(G)     1.8685(d)     0.8213(G)     1.9157(d) 
285 [ minus     or  logical] : auto speedups     1.1894(G)     1.2245(G)     1.2111(G)     1.2912(G) speedups     0.8987(G)     1.0114(d)     0.8743(G)     1.0888(d) 
286 [ minus    and  logical] : auto speedups     1.0791(G)     1.1268(G)     1.1019(G)     1.1931(G) speedups     0.9824(G)     1.0323(d)     0.9627(G)     1.0861(d) 
287 [ minus    xor  logical] : auto speedups     1.1153(G)     1.1210(G)     1.1384(G)     1.2179(G) speedups     0.8593(G)     1.0124(d)     0.8586(G)     1.0764(d) 
288 [ minus     eq  logical] : auto speedups     1.1118(G)     1.1108(G)     1.1023(G)     1.1802(G) speedups     0.9996(G)     1.0270(d)     0.9779(G)     1.0568(d) 
289 [ times    min  logical] : auto speedups     1.0551(G)     1.0378(G)     1.0675(G)     1.1214(G) speedups     1.0068(G)     1.0051(d)     0.9876(G)     1.0544(d) 
290 [ times    min     int8] : auto speedups     0.9294(G)     0.9490(G)     0.9204(G)     0.9620(G) speedups     0.9456(G)     0.8379(d)     0.9129(G)     0.9076(d) 
291 [ times    min    uint8] : auto speedups     0.9399(G)     0.9704(G)     0.9687(G)     1.0271(G) speedups     0.8367(G)     0.8542(d)     0.8285(G)     0.9256(d) 
292 [ times    min    int16] : auto speedups     0.6721(G)     0.6843(G)     0.6629(G)     0.7071(G) speedups     0.9037(G)     0.8342(d)     0.9399(G)     0.9006(d) 
293 [ times    min   uint16] : auto speedups     0.6733(G)     0.6907(G)     0.6782(G)     0.7250(G) speedups     0.9648(G)     0.8751(d)     0.9318(G)     0.9239(d) 
294 [ times    min    int32] : auto speedups     1.0963(G)     1.0905(G)     1.1209(G)     1.1786(G) speedups     0.9875(G)     0.8733(d)     0.9526(G)     0.9257(d) 
295 [ times    min   uint32] : auto speedups     1.1257(G)     1.1277(G)     1.1255(G)     1.1895(G) speedups     0.9659(G)     0.8831(d)     0.9512(G)     0.9263(d) 
296 [ times    min    int64] : auto speedups     1.0168(G)     0.9894(G)     1.0243(G)     1.1134(G) speedups     0.8697(G)     0.8711(d)     0.8543(G)     0.9058(d) 
297 [ times    min   uint64] : auto speedups     1.0139(G)     1.0305(G)     1.0319(G)     1.0585(G) speedups     0.9444(G)     0.8077(d)     0.9272(G)     0.8350(d) 
298 [ times    min   single] : auto speedups     0.9896(G)     1.0170(G)     1.0134(G)     1.0578(G) speedups     0.9150(G)     0.9842(d)     0.8984(G)     1.0205(d) 
299 [ times    min   double] : auto speedups     0.9450(G)     0.9577(G)     0.9442(G)     1.0275(G) speedups     0.8586(G)     2.5956(d)     0.8290(G)     2.6354(d) 
300 [ times    max  logical] : auto speedups     1.0188(G)     1.0331(G)     1.0522(G)     1.1047(G) speedups     1.0403(G)     1.0128(d)     1.0244(G)     1.0634(d) 
301 [ times    max     int8] : auto speedups     0.8954(G)     0.9247(G)     0.9133(G)     0.9759(G) speedups     0.9939(G)     0.8632(d)     0.9834(G)     0.9076(d) 
302 [ times    max    uint8] : auto speedups     0.9567(G)     0.9566(G)     0.9368(G)     1.0198(G) speedups     0.8361(G)     0.8658(d)     0.8142(G)     0.8651(d) 
303 [ times    max    int16] : auto speedups     0.6448(G)     0.6685(G)     0.6539(G)     0.7057(G) speedups     0.8699(G)     0.8478(d)     0.8437(G)     0.8950(d) 
304 [ times    max   uint16] : auto speedups     0.6134(G)     0.6285(G)     0.6212(G)     0.6729(G) speedups     0.8434(G)     0.8651(d)     0.8125(G)     0.8828(d) 
305 [ times    max    int32] : auto speedups     1.0791(G)     1.0098(G)     1.0638(G)     1.1086(G) speedups     0.6784(G)     0.7919(d)     0.7951(G)     0.8969(d) 
306 [ times    max   uint32] : auto speedups     1.0566(G)     1.0596(G)     1.0462(G)     1.1458(G) speedups     0.9369(G)     0.8632(d)     0.9206(G)     0.8932(d) 
307 [ times    max    int64] : auto speedups     1.0017(G)     1.0249(G)     1.0134(G)     1.0590(G) speedups     0.9458(G)     0.8577(d)     0.9295(G)     0.8941(d) 
308 [ times    max   uint64] : auto speedups     0.9657(G)     0.9598(G)     0.9533(G)     1.0255(G) speedups     0.9304(G)     0.8085(d)     0.9115(G)     0.8399(d) 
309 [ times    max   single] : auto speedups     1.0041(G)     1.0087(G)     1.0042(G)     1.0489(G) speedups     0.9322(G)     0.9832(d)     0.9209(G)     1.0214(d) 
310 [ times    max   double] : auto speedups     0.9325(G)     0.9466(G)     0.9351(G)     0.9932(G) speedups     1.0805(G)     2.5287(d)     1.0503(G)     2.5580(d) 
311 [ times   plus  logical] : auto speedups     1.0192(G)     1.0414(G)     1.0374(G)     1.1123(G) speedups     1.0413(G)     1.0058(d)     0.9853(G)     1.0674(d) 
312 [ times   plus     int8] : auto speedups     0.9317(G)     0.9610(G)     0.9613(G)     1.0230(G) speedups     0.8553(G)     0.8655(d)     0.8345(G)     0.9148(d) 
313 [ times   plus    uint8] : auto speedups     0.9313(G)     0.9701(G)     0.9634(G)     1.0306(G) speedups     0.9339(G)     0.8741(d)     0.9249(G)     0.9106(d) 
314 [ times   plus    int16] : auto speedups     0.9473(G)     0.9705(G)     0.9430(G)     1.0331(G) speedups     0.9155(G)     0.8550(d)     0.9133(G)     0.9145(d) 
315 [ times   plus   uint16] : auto speedups     0.9389(G)     0.9672(G)     0.9460(G)     1.0223(G) speedups     0.9672(G)     0.8801(d)     0.9525(G)     0.9133(d) 
316 [ times   plus    int32] : auto speedups     1.1995(G)     1.2163(G)     1.1905(G)     1.2357(G) speedups     0.8778(G)     0.8674(d)     0.8904(G)     0.9101(d) 
317 [ times   plus   uint32] : auto speedups     1.1873(G)     1.2053(G)     1.1893(G)     1.2480(G) speedups     0.9659(G)     0.8724(d)     0.9484(G)     0.8977(d) 
318 [ times   plus    int64] : auto speedups     1.0591(G)     1.0658(G)     1.0685(G)     1.1384(G) speedups     0.9048(G)     0.8606(d)     0.8723(G)     0.8913(d) 
319 [ times   plus   uint64] : auto speedups     1.0102(G)     1.0920(G)     1.0475(G)     1.1190(G) speedups     0.8415(G)     0.8100(d)     0.8226(G)     0.8509(d) 
320 [ times   plus   single] : auto speedups     1.1868(G)     1.2355(G)     1.2099(G)     1.2837(G) speedups     0.8796(G)     1.0205(d)     0.8680(G)     1.0508(d) 
321 [ times   plus   double] : auto speedups     1.1357(G)     1.1304(G)     1.1357(G)     1.1987(G) speedups     0.8681(G)     2.5220(d)     0.8564(G)     2.2861(d) 
322 [ times  times  logical] : auto speedups     0.9701(G)     0.9979(G)     0.9370(G)     1.0281(G) speedups     0.9684(G)     0.9704(d)     0.9303(G)     0.9849(d) 
323 [ times  times     int8] : auto speedups     0.7027(G)     0.7060(G)     0.7075(G)     0.7733(G) speedups     0.9985(G)     0.8482(d)     0.9382(G)     0.9088(d) 
324 [ times  times    uint8] : auto speedups     0.7133(G)     0.7234(G)     0.7259(G)     0.7834(G) speedups     1.0074(G)     0.8743(d)     0.9743(G)     0.9138(d) 
325 [ times  times    int16] : auto speedups     1.1538(G)     1.1730(G)     1.1885(G)     1.2312(G) speedups     0.9028(G)     0.8702(d)     0.8840(G)     0.8971(d) 
326 [ times  times   uint16] : auto speedups     1.1731(G)     1.1758(G)     1.1860(G)     1.2621(G) speedups     0.9119(G)     0.8553(d)     0.8835(G)     0.9213(d) 
327 [ times  times    int32] : auto speedups     1.1723(G)     1.2019(G)     1.2129(G)     1.2425(G) speedups     0.8451(G)     0.8858(d)     0.8227(G)     0.9122(d) 
328 [ times  times   uint32] : auto speedups     1.1671(G)     1.1958(G)     1.1957(G)     1.2577(G) speedups     0.8274(G)     0.8679(d)     0.8246(G)     0.9277(d) 
329 [ times  times    int64] : auto speedups     1.0530(G)     1.0857(G)     1.0587(G)     1.1413(G) speedups     0.9772(G)     0.8492(d)     0.9550(G)     0.9047(d) 
330 [ times  times   uint64] : auto speedups     1.0568(G)     1.0771(G)     1.0723(G)     1.1346(G) speedups     0.9677(G)     0.8078(d)     0.9470(G)     0.8527(d) 
331 [ times  times   single] : auto speedups     0.7915(G)     0.7944(G)     0.7939(G)     0.8382(G) speedups     0.8768(G)     0.9664(d)     0.8517(G)     1.0091(d) 
332 [ times  times   double] : auto speedups     0.7546(G)     0.7644(G)     0.7672(G)     0.8190(G) speedups     0.8517(G)     2.2065(d)     0.8347(G)     2.2897(d) 
333 [ times     or  logical] : auto speedups     0.9958(G)     1.0652(G)     1.0162(G)     1.1095(G) speedups     1.0395(G)     0.9894(d)     1.0076(G)     1.0657(d) 
334 [ times    and  logical] : auto speedups     1.0231(G)     1.0569(G)     1.0309(G)     1.1098(G) speedups     0.9854(G)     0.9771(d)     0.9724(G)     1.0265(d) 
335 [ times    xor  logical] : auto speedups     1.0730(G)     1.1398(G)     1.0903(G)     1.1667(G) speedups     1.0033(G)     0.9853(d)     0.9698(G)     1.0593(d) 
336 [ times     eq  logical] : auto speedups     1.1002(G)     1.1463(G)     1.1066(G)     1.2018(G) speedups     1.1011(G)     0.9995(d)     1.0571(G)     1.0389(d) 
337 [   div    min  logical] : auto speedups     1.0780(G)     1.1182(G)     1.1057(G)     1.1788(G) speedups     0.8974(G)     0.9968(d)     0.8782(G)     1.0579(d) 
338 [   div    min     int8] : auto speedups     0.8145(G)     0.8382(G)     0.8399(G)     0.8992(G) speedups     0.8964(G)     0.7571(d)     0.8675(G)     0.7889(d) 
339 [   div    min    uint8] : auto speedups     1.0089(G)     1.0388(G)     1.0314(G)     1.0849(G) speedups     0.9397(G)     0.7870(d)     0.9105(G)     0.8279(d) 
340 [   div    min    int16] : auto speedups     0.9246(G)     0.9467(G)     0.9262(G)     1.0011(G) speedups     0.8794(G)     0.7828(d)     0.8569(G)     0.8234(d) 
341 [   div    min   uint16] : auto speedups     0.9915(G)     1.0074(G)     0.9991(G)     1.0656(G) speedups     0.9573(G)     0.8135(d)     0.9169(G)     0.8386(d) 
342 [   div    min    int32] : auto speedups     0.9233(G)     0.9584(G)     0.9295(G)     1.0024(G) speedups     0.9323(G)     0.7712(d)     0.8986(G)     0.8194(d) 
343 [   div    min   uint32] : auto speedups     0.8334(G)     0.8500(G)     0.8406(G)     0.9042(G) speedups     0.8430(G)     0.8181(d)     0.8346(G)     0.8590(d) 
344 [   div    min    int64] : auto speedups     0.8198(G)     0.8542(G)     0.8346(G)     0.8636(G) speedups     0.8772(G)     0.7467(d)     0.8388(G)     0.7766(d) 
345 [   div    min   uint64] : auto speedups     0.7668(G)     0.7876(G)     0.7716(G)     0.8209(G) speedups     0.8153(G)     0.7454(d)     0.7979(G)     0.7917(d) 
346 [   div    min   single] : auto speedups     0.7121(G)     0.7229(G)     0.7173(G)     0.7502(G) speedups     1.0373(G)     0.9639(d)     1.0233(G)     1.0063(d) 
347 [   div    min   double] : auto speedups     0.4415(G)     0.4478(G)     0.4402(G)     0.4807(G) speedups     0.8588(G)     1.7835(d)     0.8387(G)     1.8546(d) 
348 [   div    max  logical] : auto speedups     1.2130(G)     1.2116(G)     1.2260(G)     1.2821(G) speedups     0.8496(G)     0.9898(d)     0.8239(G)     1.0881(d) 
349 [   div    max     int8] : auto speedups     0.8238(G)     0.8445(G)     0.8399(G)     0.8985(G) speedups     0.8632(G)     0.7558(d)     0.8495(G)     0.7863(d) 
350 [   div    max    uint8] : auto speedups     0.9913(G)     1.0149(G)     0.9902(G)     1.0795(G) speedups     0.8901(G)     0.7995(d)     0.8935(G)     0.8393(d) 
351 [   div    max    int16] : auto speedups     0.9034(G)     0.9349(G)     0.9302(G)     0.9684(G) speedups     0.8685(G)     0.7777(d)     0.8459(G)     0.8220(d) 
352 [   div    max   uint16] : auto speedups     0.9424(G)     0.9490(G)     0.9602(G)     1.0363(G) speedups     0.9600(G)     0.8220(d)     0.9566(G)     0.8596(d) 
353 [   div    max    int32] : auto speedups     0.9304(G)     0.9567(G)     0.9364(G)     0.9867(G) speedups     0.9634(G)     0.7708(d)     0.9489(G)     0.8122(d) 
354 [   div    max   uint32] : auto speedups     0.7710(G)     0.7756(G)     0.7720(G)     0.8155(G) speedups     0.8372(G)     0.8297(d)     0.8388(G)     0.8612(d) 
355 [   div    max    int64] : auto speedups     0.8206(G)     0.8418(G)     0.8352(G)     0.8906(G) speedups     0.7898(G)     0.7383(d)     0.7719(G)     0.7767(d) 
356 [   div    max   uint64] : auto speedups     0.7056(G)     0.7221(G)     0.7168(G)     0.7599(G) speedups     0.9231(G)     0.7421(d)     0.8998(G)     0.7880(d) 
357 [   div    max   single] : auto speedups     0.7226(G)     0.7327(G)     0.7264(G)     0.7750(G) speedups     0.9117(G)     0.9602(d)     0.8916(G)     1.0166(d) 
358 [   div    max   double] : auto speedups     0.4407(G)     0.4507(G)     0.4469(G)     0.4836(G) speedups     0.9400(G)     1.8609(d)     0.9316(G)     1.8985(d) 
359 [   div   plus  logical] : auto speedups     1.1765(G)     1.2236(G)     1.2445(G)     1.3149(G) speedups     0.8542(G)     1.0217(d)     0.8316(G)     1.0589(d) 
360 [   div   plus     int8] : auto speedups     1.0299(G)     1.0321(G)     1.0285(G)     1.0651(G) speedups     0.8485(G)     0.7858(d)     0.8249(G)     0.8204(d) 
361 [   div   plus    uint8] : auto speedups     1.0800(G)     1.1002(G)     1.1061(G)     1.1688(G) speedups     0.9550(G)     0.8228(d)     0.9362(G)     0.8584(d) 
362 [   div   plus    int16] : auto speedups     1.0116(G)     1.0476(G)     1.0269(G)     1.0823(G) speedups     0.7664(G)     0.8085(d)     0.7528(G)     0.8471(d) 
363 [   div   plus   uint16] : auto speedups     1.0431(G)     1.0598(G)     1.0546(G)     1.1363(G) speedups     0.9917(G)     0.8286(d)     0.9705(G)     0.8657(d) 
364 [   div   plus    int32] : auto speedups     1.0043(G)     1.0387(G)     1.0165(G)     1.0928(G) speedups     0.7944(G)     0.7756(d)     0.7872(G)     0.8170(d) 
365 [   div   plus   uint32] : auto speedups     1.0779(G)     1.0935(G)     1.0880(G)     1.1652(G) speedups     0.8895(G)     0.8320(d)     0.8682(G)     0.8829(d) 
366 [   div   plus    int64] : auto speedups     0.9030(G)     0.9005(G)     0.9134(G)     0.9657(G) speedups     0.8490(G)     0.7479(d)     0.8334(G)     0.7794(d) 
367 [   div   plus   uint64] : auto speedups     0.9446(G)     0.9546(G)     0.9386(G)     0.9993(G) speedups     0.8295(G)     0.7527(d)     0.8094(G)     0.7992(d) 
368 [   div   plus   single] : auto speedups     0.7166(G)     0.7315(G)     0.7201(G)     0.7617(G) speedups     0.8568(G)     0.9850(d)     0.8567(G)     1.0470(d) 
369 [   div   plus   double] : auto speedups     0.4484(G)     0.4553(G)     0.4503(G)     0.4840(G) speedups     0.9028(G)     1.8390(d)     0.8846(G)     1.9302(d) 
370 [   div  times  logical] : auto speedups     1.0724(G)     1.1090(G)     1.1003(G)     1.1764(G) speedups     0.9031(G)     1.0067(d)     0.8815(G)     1.0466(d) 
371 [   div  times     int8] : auto speedups     1.0244(G)     1.0401(G)     1.0300(G)     1.1044(G) speedups     0.9339(G)     0.7869(d)     0.9073(G)     0.8271(d) 
372 [   div  times    uint8] : auto speedups     0.9570(G)     0.9738(G)     0.9621(G)     1.0210(G) speedups     0.9573(G)     0.8256(d)     0.9344(G)     0.8621(d) 
373 [   div  times    int16] : auto speedups     1.0203(G)     1.0369(G)     1.0330(G)     1.1050(G) speedups     0.9367(G)     0.7995(d)     0.9092(G)     0.8169(d) 
374 [   div  times   uint16] : auto speedups     0.8188(G)     0.8376(G)     0.8343(G)     0.8836(G) speedups     1.0105(G)     0.8128(d)     0.9845(G)     0.8497(d) 
375 [   div  times    int32] : auto speedups     1.0206(G)     1.0400(G)     1.0202(G)     1.0976(G) speedups     0.8566(G)     0.7796(d)     0.8328(G)     0.8175(d) 
376 [   div  times   uint32] : auto speedups     0.8436(G)     0.8757(G)     0.8395(G)     0.8739(G) speedups     0.8507(G)     0.7826(d)     0.8759(G)     0.8157(d) 
377 [   div  times    int64] : auto speedups     0.8868(G)     0.9020(G)     0.9026(G)     0.9521(G) speedups     0.8864(G)     0.7302(d)     0.8675(G)     0.7793(d) 
378 [   div  times   uint64] : auto speedups     0.7306(G)     0.7464(G)     0.7364(G)     0.8158(G) speedups     0.9167(G)     0.7631(d)     0.8724(G)     0.7221(d) 
379 [   div  times   single] : auto speedups     0.5426(G)     0.5498(G)     0.5533(G)     0.6094(G) speedups     0.8044(G)     0.9137(d)     0.7659(G)     0.8864(d) 
380 [   div  times   double] : auto speedups     0.4251(G)     0.4377(G)     0.4334(G)     0.4764(G) speedups     0.9490(G)     1.6838(d)     0.9169(G)     1.8372(d) 
381 [   div     or  logical] : auto speedups     1.1346(G)     1.2368(G)     1.1826(G)     1.2958(G) speedups     0.8351(G)     1.0092(d)     0.8218(G)     1.0562(d) 
382 [   div    and  logical] : auto speedups     1.0836(G)     1.0516(G)     1.0446(G)     1.1340(G) speedups     0.8827(G)     0.9794(d)     0.8736(G)     1.0342(d) 
383 [   div    xor  logical] : auto speedups     1.1642(G)     1.1977(G)     1.1872(G)     1.2884(G) speedups     0.8694(G)     1.0075(d)     0.8668(G)     1.0646(d) 
384 [   div     eq  logical] : auto speedups     1.0981(G)     1.1271(G)     1.1008(G)     1.1859(G) speedups     0.8850(G)     1.0076(d)     0.8748(G)     1.0502(d) 
385 [  iseq    min  logical] : auto speedups     1.0633(G)     1.1031(G)     1.0627(G)     1.1665(G) speedups     0.9379(G)     0.9834(d)     0.9399(G)     1.0378(d) 
386 [  iseq    min     int8] : auto speedups     0.9780(G)     1.0036(G)     0.9911(G)     1.0577(G) speedups     0.9289(G)     0.8467(d)     0.8936(G)     0.8802(d) 
387 [  iseq    min    uint8] : auto speedups     0.9667(G)     0.9741(G)     0.9843(G)     1.0417(G) speedups     0.9422(G)     0.8090(d)     0.9200(G)     0.8042(d) 
388 [  iseq    min    int16] : auto speedups     1.0172(G)     1.0533(G)     1.0363(G)     1.1189(G) speedups     0.8300(G)     0.8564(d)     0.8056(G)     0.8965(d) 
389 [  iseq    min   uint16] : auto speedups     1.0358(G)     1.0613(G)     1.0512(G)     1.1157(G) speedups     1.0098(G)     0.8572(d)     1.0002(G)     0.8890(d) 
390 [  iseq    min    int32] : auto speedups     1.0583(G)     1.0710(G)     1.0456(G)     1.1341(G) speedups     0.9649(G)     0.8575(d)     0.9403(G)     0.8966(d) 
391 [  iseq    min   uint32] : auto speedups     1.0510(G)     1.0779(G)     1.0541(G)     1.1348(G) speedups     0.9542(G)     0.8617(d)     0.9381(G)     0.8790(d) 
392 [  iseq    min    int64] : auto speedups     0.8168(G)     0.8211(G)     0.8157(G)     0.8572(G) speedups     0.8554(G)     0.8571(d)     0.8069(G)     0.8772(d) 
393 [  iseq    min   uint64] : auto speedups     0.7863(G)     0.8230(G)     0.8234(G)     0.8550(G) speedups     0.9387(G)     0.7984(d)     0.9169(G)     0.8404(d) 
394 [  iseq    min   single] : auto speedups     0.9434(G)     0.9580(G)     0.9688(G)     1.0069(G) speedups     1.0447(G)     0.9735(d)     1.0028(G)     1.0141(d) 
395 [  iseq    min   double] : auto speedups     0.9122(G)     0.9194(G)     0.9039(G)     0.9831(G) speedups     1.0420(G)     2.1869(d)     1.0268(G)     2.2645(d) 
396 [  iseq    max  logical] : auto speedups     1.1721(G)     1.2025(G)     1.1995(G)     1.2792(G) speedups     0.9132(G)     0.9951(d)     0.8716(G)     1.0651(d) 
397 [  iseq    max     int8] : auto speedups     1.0568(G)     1.0928(G)     1.0788(G)     1.1624(G) speedups     0.8471(G)     0.8363(d)     0.8368(G)     0.8912(d) 
398 [  iseq    max    uint8] : auto speedups     1.0486(G)     1.0734(G)     1.0616(G)     1.1502(G) speedups     0.8588(G)     0.8050(d)     0.8288(G)     0.8541(d) 
399 [  iseq    max    int16] : auto speedups     1.0289(G)     1.0524(G)     1.0274(G)     1.1040(G) speedups     0.9006(G)     0.8423(d)     0.8708(G)     0.9023(d) 
400 [  iseq    max   uint16] : auto speedups     0.9736(G)     1.0100(G)     1.0021(G)     1.0866(G) speedups     0.8905(G)     0.8571(d)     0.8690(G)     0.8785(d) 
401 [  iseq    max    int32] : auto speedups     1.0369(G)     1.0620(G)     1.0675(G)     1.1149(G) speedups     0.8065(G)     0.8646(d)     0.8241(G)     0.9069(d) 
402 [  iseq    max   uint32] : auto speedups     1.0147(G)     1.0243(G)     1.0155(G)     1.0750(G) speedups     0.9482(G)     0.8616(d)     0.9278(G)     0.9005(d) 
403 [  iseq    max    int64] : auto speedups     0.8151(G)     0.8278(G)     0.8167(G)     0.8780(G) speedups     0.9466(G)     0.8520(d)     0.9213(G)     0.8909(d) 
404 [  iseq    max   uint64] : auto speedups     0.7361(G)     0.7377(G)     0.6139(G)     0.6942(G) speedups     0.8140(G)     0.7658(d)     0.8911(G)     0.8303(d) 
405 [  iseq    max   single] : auto speedups     0.9183(G)     0.9499(G)     0.8215(G)     0.9459(G) speedups     0.9489(G)     0.8178(d)     0.8931(G)     0.9762(d) 
406 [  iseq    max   double] : auto speedups     0.8682(G)     0.7757(G)     0.8842(G)     0.9671(G) speedups     1.0530(G)     2.1649(d)     0.9994(G)     2.2077(d) 
407 [  iseq   plus  logical] : auto speedups     1.1465(G)     1.1750(G)     1.1432(G)     1.1487(G) speedups     0.9034(G)     0.9705(d)     0.8911(G)     0.9584(d) 
408 [  iseq   plus     int8] : auto speedups     0.8214(G)     1.0041(G)     1.0941(G)     1.1567(G) speedups     0.8984(G)     0.7750(d)     0.8513(G)     0.8601(d) 
409 [  iseq   plus    uint8] : auto speedups     1.0949(G)     1.1006(G)     1.0461(G)     1.0720(G) speedups     0.8593(G)     0.8362(d)     0.8506(G)     0.8497(d) 
410 [  iseq   plus    int16] : auto speedups     1.0662(G)     1.0339(G)     1.0623(G)     1.0826(G) speedups     0.7764(G)     0.7830(d)     0.7558(G)     0.8410(d) 
411 [  iseq   plus   uint16] : auto speedups     1.0419(G)     1.0649(G)     1.0775(G)     1.0936(G) speedups     0.7840(G)     0.8322(d)     0.7749(G)     0.8541(d) 
412 [  iseq   plus    int32] : auto speedups     1.0276(G)     1.0397(G)     1.0314(G)     1.1983(G) speedups     0.8102(G)     0.7941(d)     0.8020(G)     0.8465(d) 
413 [  iseq   plus   uint32] : auto speedups     1.0187(G)     1.0396(G)     1.0428(G)     1.1198(G) speedups     0.8206(G)     0.8022(d)     0.6602(G)     0.8676(d) 
414 [  iseq   plus    int64] : auto speedups     0.8207(G)     1.0310(G)     1.0166(G)     1.0699(G) speedups     0.6796(G)     0.7772(d)     0.7628(G)     0.8788(d) 
415 [  iseq   plus   uint64] : auto speedups     0.9382(G)     0.9524(G)     0.9222(G)     1.0540(G) speedups     0.7858(G)     0.7766(d)     0.7877(G)     0.8074(d) 
416 [  iseq   plus   single] : auto speedups     1.1232(G)     1.1408(G)     1.0842(G)     1.1364(G) speedups     0.9843(G)     0.9345(d)     0.9226(G)     1.0093(d) 
417 [  iseq   plus   double] : auto speedups     0.9131(G)     0.9165(G)     0.8955(G)     0.9554(G) speedups     0.9702(G)     1.8412(d)     0.9421(G)     1.9030(d) 
418 [  iseq  times  logical] : auto speedups     1.0046(G)     1.0076(G)     0.9791(G)     1.0334(G) speedups     0.9315(G)     0.9867(d)     0.9246(G)     1.0217(d) 
419 [  iseq  times     int8] : auto speedups     1.0994(G)     1.1149(G)     1.0907(G)     1.1670(G) speedups     0.9604(G)     0.8608(d)     0.9375(G)     0.9015(d) 
420 [  iseq  times    uint8] : auto speedups     1.1412(G)     1.1507(G)     1.1409(G)     1.2101(G) speedups     0.9396(G)     0.8572(d)     0.9215(G)     0.9190(d) 
421 [  iseq  times    int16] : auto speedups     1.1424(G)     1.1571(G)     1.1384(G)     1.2199(G) speedups     0.9685(G)     0.8421(d)     0.9407(G)     0.9060(d) 
422 [  iseq  times   uint16] : auto speedups     1.0662(G)     1.0961(G)     1.1075(G)     1.1599(G) speedups     0.8594(G)     0.8718(d)     0.8368(G)     0.9088(d) 
423 [  iseq  times    int32] : auto speedups     1.1660(G)     1.1707(G)     1.1701(G)     1.2462(G) speedups     0.8431(G)     0.8566(d)     0.8136(G)     0.8940(d) 
424 [  iseq  times   uint32] : auto speedups     1.1438(G)     1.1570(G)     1.1301(G)     1.2634(G) speedups     0.8333(G)     0.8678(d)     0.8277(G)     0.8926(d) 
425 [  iseq  times    int64] : auto speedups     1.0251(G)     1.0684(G)     1.0508(G)     1.1326(G) speedups     0.8189(G)     0.8591(d)     0.7983(G)     0.9039(d) 
426 [  iseq  times   uint64] : auto speedups     1.0488(G)     1.0432(G)     1.0196(G)     1.1181(G) speedups     0.8251(G)     0.8113(d)     0.7947(G)     0.8489(d) 
427 [  iseq  times   single] : auto speedups     1.1439(G)     1.1795(G)     1.1681(G)     1.2025(G) speedups     1.0742(G)     1.0106(d)     1.0423(G)     1.0706(d) 
428 [  iseq  times   double] : auto speedups     1.0889(G)     1.1224(G)     1.0962(G)     1.1859(G) speedups     1.0667(G)     2.0587(d)     1.0368(G)     2.2109(d) 
429 [  iseq     or  logical] : auto speedups     1.1688(G)     1.2013(G)     1.1885(G)     1.2347(G) speedups     0.8953(G)     1.0148(d)     0.8807(G)     1.0479(d) 
430 [  iseq    and  logical] : auto speedups     1.0820(G)     1.0860(G)     1.0941(G)     1.1517(G) speedups     0.9498(G)     0.9915(d)     0.9294(G)     1.0226(d) 
431 [  iseq    xor  logical] : auto speedups     1.0847(G)     1.1126(G)     1.0947(G)     1.1466(G) speedups     0.9254(G)     0.9983(d)     0.9111(G)     1.0348(d) 
432 [  iseq     eq  logical] : auto speedups     1.0979(G)     1.1194(G)     1.0856(G)     1.1569(G) speedups     0.9253(G)     0.9884(d)     0.9103(G)     1.0324(d) 
433 [  isne    min  logical] : auto speedups     1.0707(G)     1.0801(G)     1.0736(G)     1.1577(G) speedups     0.9549(G)     1.0157(d)     0.9446(G)     1.0595(d) 
434 [  isne    min     int8] : auto speedups     1.0664(G)     1.1008(G)     1.0874(G)     1.1765(G) speedups     0.9125(G)     0.8503(d)     0.8985(G)     0.8961(d) 
435 [  isne    min    uint8] : auto speedups     1.0747(G)     1.0811(G)     1.0675(G)     1.1499(G) speedups     0.8513(G)     0.8053(d)     0.8250(G)     0.8652(d) 
436 [  isne    min    int16] : auto speedups     1.0238(G)     1.0380(G)     1.0394(G)     1.1071(G) speedups     0.9095(G)     0.8467(d)     0.8623(G)     0.8882(d) 
437 [  isne    min   uint16] : auto speedups     1.0498(G)     1.0423(G)     1.0365(G)     1.1172(G) speedups     0.8090(G)     0.8456(d)     0.8011(G)     0.8948(d) 
438 [  isne    min    int32] : auto speedups     0.9966(G)     1.0379(G)     1.0448(G)     1.0994(G) speedups     0.9406(G)     0.8503(d)     0.7606(G)     0.8831(d) 
439 [  isne    min   uint32] : auto speedups     1.0426(G)     1.0714(G)     1.0353(G)     1.1393(G) speedups     0.9519(G)     0.8437(d)     0.9570(G)     0.8524(d) 
440 [  isne    min    int64] : auto speedups     0.7497(G)     0.7714(G)     0.7938(G)     0.8689(G) speedups     0.7976(G)     0.7955(d)     0.7743(G)     0.8821(d) 
441 [  isne    min   uint64] : auto speedups     0.7943(G)     0.7692(G)     0.8046(G)     0.8334(G) speedups     0.7934(G)     0.7875(d)     0.7955(G)     0.8260(d) 
442 [  isne    min   single] : auto speedups     0.9533(G)     0.9671(G)     0.9437(G)     1.0261(G) speedups     0.9990(G)     0.9740(d)     0.9694(G)     0.9154(d) 
443 [  isne    min   double] : auto speedups     0.8640(G)     0.8556(G)     0.9189(G)     0.9550(G) speedups     0.9818(G)     2.2925(d)     0.9525(G)     2.3655(d) 
444 [  isne    max  logical] : auto speedups     1.1595(G)     1.1947(G)     1.1540(G)     1.2617(G) speedups     0.8859(G)     0.9962(d)     0.8761(G)     1.0174(d) 
445 [  isne    max     int8] : auto speedups     0.9581(G)     0.9849(G)     0.9678(G)     1.0229(G) speedups     0.9348(G)     0.8421(d)     0.9056(G)     0.8868(d) 
446 [  isne    max    uint8] : auto speedups     0.9795(G)     0.9819(G)     0.9834(G)     1.0135(G) speedups     0.8976(G)     0.8064(d)     0.8912(G)     0.8570(d) 
447 [  isne    max    int16] : auto speedups     1.0286(G)     1.0468(G)     1.0527(G)     1.0786(G) speedups     0.9101(G)     0.8287(d)     0.8766(G)     0.8748(d) 
448 [  isne    max   uint16] : auto speedups     0.9916(G)     0.9810(G)     0.9854(G)     1.0653(G) speedups     0.7984(G)     0.8265(d)     0.7793(G)     0.8968(d) 
449 [  isne    max    int32] : auto speedups     1.0560(G)     1.0561(G)     1.0517(G)     1.1044(G) speedups     0.9608(G)     0.8482(d)     0.9378(G)     0.9032(d) 
450 [  isne    max   uint32] : auto speedups     1.0028(G)     1.0099(G)     0.9917(G)     1.0800(G) speedups     0.9368(G)     0.8394(d)     0.9266(G)     0.8974(d) 
451 [  isne    max    int64] : auto speedups     0.8091(G)     0.8094(G)     0.8084(G)     0.8714(G) speedups     0.9540(G)     0.8494(d)     0.9210(G)     0.8938(d) 
452 [  isne    max   uint64] : auto speedups     0.7352(G)     0.7466(G)     0.7371(G)     0.7846(G) speedups     0.9276(G)     0.7974(d)     0.9068(G)     0.8258(d) 
453 [  isne    max   single] : auto speedups     0.9609(G)     0.9604(G)     0.9386(G)     1.0077(G) speedups     1.0353(G)     0.9631(d)     1.0215(G)     1.0079(d) 
454 [  isne    max   double] : auto speedups     0.9108(G)     0.9253(G)     0.9198(G)     0.9677(G) speedups     1.0395(G)     2.0513(d)     0.9505(G)     2.2502(d) 
455 [  isne   plus  logical] : auto speedups     1.0631(G)     1.1418(G)     1.1249(G)     1.2368(G) speedups     0.8834(G)     0.9844(d)     0.8657(G)     1.0403(d) 
456 [  isne   plus     int8] : auto speedups     1.1456(G)     1.1454(G)     1.1473(G)     1.2134(G) speedups     0.8688(G)     0.8705(d)     0.8536(G)     0.9014(d) 
457 [  isne   plus    uint8] : auto speedups     1.1591(G)     1.1557(G)     1.1503(G)     1.2131(G) speedups     0.8591(G)     0.8472(d)     0.8337(G)     0.9058(d) 
458 [  isne   plus    int16] : auto speedups     1.1081(G)     1.1391(G)     1.1231(G)     1.1946(G) speedups     0.9146(G)     0.8697(d)     0.8879(G)     0.9010(d) 
459 [  isne   plus   uint16] : auto speedups     1.1033(G)     1.1430(G)     1.1112(G)     1.1994(G) speedups     0.8975(G)     0.8584(d)     0.8854(G)     0.8993(d) 
460 [  isne   plus    int32] : auto speedups     1.1373(G)     1.1366(G)     1.1070(G)     1.2119(G) speedups     0.8588(G)     0.8495(d)     0.8524(G)     0.9071(d) 
461 [  isne   plus   uint32] : auto speedups     1.1127(G)     1.1384(G)     1.1119(G)     1.1567(G) speedups     0.8541(G)     0.8565(d)     0.8136(G)     0.9006(d) 
462 [  isne   plus    int64] : auto speedups     1.0558(G)     1.0356(G)     1.0263(G)     1.1062(G) speedups     0.9016(G)     0.8718(d)     0.8655(G)     0.9113(d) 
463 [  isne   plus   uint64] : auto speedups     1.0366(G)     1.0294(G)     1.0207(G)     1.1126(G) speedups     0.8829(G)     0.8171(d)     0.8765(G)     0.8527(d) 
464 [  isne   plus   single] : auto speedups     1.1738(G)     1.1774(G)     1.1700(G)     1.2384(G) speedups     1.0348(G)     1.0061(d)     1.0021(G)     1.0286(d) 
465 [  isne   plus   double] : auto speedups     1.0164(G)     1.0206(G)     1.0436(G)     1.1145(G) speedups     1.0616(G)     2.1503(d)     1.0306(G)     2.2787(d) 
466 [  isne  times  logical] : auto speedups     1.0727(G)     1.0860(G)     1.0672(G)     1.1589(G) speedups     0.9572(G)     1.0068(d)     0.9425(G)     1.0741(d) 
467 [  isne  times     int8] : auto speedups     1.1839(G)     1.2012(G)     1.1918(G)     1.2654(G) speedups     0.9664(G)     0.8621(d)     0.9567(G)     0.9082(d) 
468 [  isne  times    uint8] : auto speedups     1.1882(G)     1.1643(G)     1.1824(G)     1.2771(G) speedups     0.8506(G)     0.8386(d)     0.8296(G)     0.8886(d) 
469 [  isne  times    int16] : auto speedups     1.1694(G)     1.1646(G)     1.1680(G)     1.2445(G) speedups     0.9496(G)     0.8585(d)     0.9182(G)     0.9075(d) 
470 [  isne  times   uint16] : auto speedups     1.1934(G)     1.1793(G)     1.1954(G)     1.2675(G) speedups     0.8912(G)     0.8635(d)     0.8715(G)     0.8969(d) 
471 [  isne  times    int32] : auto speedups     1.2204(G)     1.2246(G)     1.2179(G)     1.2739(G) speedups     0.8275(G)     0.8553(d)     0.8143(G)     0.8899(d) 
472 [  isne  times   uint32] : auto speedups     1.2399(G)     1.2284(G)     1.2111(G)     1.2912(G) speedups     0.8225(G)     0.8468(d)     0.8163(G)     0.8950(d) 
473 [  isne  times    int64] : auto speedups     1.0733(G)     1.1095(G)     1.0949(G)     1.1722(G) speedups     0.8154(G)     0.8578(d)     0.7944(G)     0.9121(d) 
474 [  isne  times   uint64] : auto speedups     1.0992(G)     1.1198(G)     1.1003(G)     1.1699(G) speedups     0.8219(G)     0.8117(d)     0.7993(G)     0.8502(d) 
475 [  isne  times   single] : auto speedups     1.1813(G)     1.1677(G)     1.1713(G)     1.2464(G) speedups     1.0420(G)     1.0068(d)     1.0166(G)     1.0601(d) 
476 [  isne  times   double] : auto speedups     1.0924(G)     1.1261(G)     1.1221(G)     1.1776(G) speedups     1.0579(G)     1.9796(d)     1.0360(G)     2.0350(d) 
477 [  isne     or  logical] : auto speedups     1.1882(G)     1.2024(G)     1.1697(G)     1.2604(G) speedups     0.8864(G)     1.0023(d)     0.8624(G)     1.0512(d) 
478 [  isne    and  logical] : auto speedups     1.0569(G)     1.0982(G)     1.0823(G)     1.1558(G) speedups     0.9583(G)     1.0186(d)     0.9234(G)     1.0673(d) 
479 [  isne    xor  logical] : auto speedups     1.1017(G)     1.1161(G)     1.0855(G)     1.1778(G) speedups     0.8525(G)     1.0075(d)     0.8340(G)     1.0703(d) 
480 [  isne     eq  logical] : auto speedups     1.0715(G)     1.0662(G)     1.0724(G)     1.1440(G) speedups     0.9619(G)     0.9913(d)     0.9543(G)     1.0467(d) 
481 [  isgt    min  logical] : auto speedups     1.0539(G)     1.0494(G)     1.0619(G)     1.1179(G) speedups     0.9528(G)     1.0195(d)     0.9392(G)     1.0627(d) 
482 [  isgt    min     int8] : auto speedups     1.0708(G)     1.0581(G)     1.0799(G)     1.1380(G) speedups     0.8565(G)     0.8271(d)     0.8332(G)     0.8849(d) 
483 [  isgt    min    uint8] : auto speedups     1.0735(G)     1.0946(G)     0.9893(G)     1.1385(G) speedups     0.8210(G)     0.8269(d)     0.7679(G)     0.8694(d) 
484 [  isgt    min    int16] : auto speedups     1.0342(G)     1.0462(G)     1.0479(G)     1.1198(G) speedups     0.8164(G)     0.8455(d)     0.8035(G)     0.8956(d) 
485 [  isgt    min   uint16] : auto speedups     1.0092(G)     1.0481(G)     1.0379(G)     1.1022(G) speedups     0.8574(G)     0.8332(d)     0.8334(G)     0.8749(d) 
486 [  isgt    min    int32] : auto speedups     1.0648(G)     1.0565(G)     1.0494(G)     1.1273(G) speedups     0.9531(G)     0.8172(d)     0.8746(G)     0.8200(d) 
487 [  isgt    min   uint32] : auto speedups     0.7463(G)     0.7500(G)     0.7190(G)     0.7953(G) speedups     0.9196(G)     0.8160(d)     0.9063(G)     0.8960(d) 
488 [  isgt    min    int64] : auto speedups     0.8166(G)     0.8115(G)     0.8136(G)     0.8646(G) speedups     0.8431(G)     0.8437(d)     0.8166(G)     0.8884(d) 
489 [  isgt    min   uint64] : auto speedups     0.7327(G)     0.7544(G)     0.7399(G)     0.7812(G) speedups     0.9174(G)     0.7820(d)     0.9223(G)     0.8173(d) 
490 [  isgt    min   single] : auto speedups     0.8516(G)     0.9277(G)     0.9548(G)     1.0089(G) speedups     0.9424(G)     0.9799(d)     0.9334(G)     0.9902(d) 
491 [  isgt    min   double] : auto speedups     0.9172(G)     0.9064(G)     0.9048(G)     0.9651(G) speedups     0.8645(G)     2.5939(d)     0.8453(G)     2.5711(d) 
492 [  isgt    max  logical] : auto speedups     1.1324(G)     1.1519(G)     1.1563(G)     1.2193(G) speedups     0.9146(G)     0.9904(d)     0.8874(G)     1.0271(d) 
493 [  isgt    max     int8] : auto speedups     0.9629(G)     0.9709(G)     0.9562(G)     1.0372(G) speedups     0.8989(G)     0.8487(d)     0.8911(G)     0.8812(d) 
494 [  isgt    max    uint8] : auto speedups     0.9884(G)     1.0326(G)     1.0056(G)     1.0814(G) speedups     0.9129(G)     0.8345(d)     0.8810(G)     0.8731(d) 
495 [  isgt    max    int16] : auto speedups     1.0456(G)     1.0324(G)     1.0470(G)     1.1167(G) speedups     0.8982(G)     0.8581(d)     0.8791(G)     0.8941(d) 
496 [  isgt    max   uint16] : auto speedups     0.9932(G)     0.9981(G)     0.9871(G)     1.0766(G) speedups     0.8613(G)     0.8427(d)     0.8489(G)     0.8892(d) 
497 [  isgt    max    int32] : auto speedups     1.0385(G)     1.0636(G)     1.0596(G)     1.1113(G) speedups     0.9478(G)     0.8505(d)     0.9322(G)     0.8921(d) 
498 [  isgt    max   uint32] : auto speedups     0.7136(G)     0.7355(G)     0.7029(G)     0.7729(G) speedups     0.9330(G)     0.8467(d)     0.9052(G)     0.8939(d) 
499 [  isgt    max    int64] : auto speedups     0.8160(G)     0.8205(G)     0.8102(G)     0.8672(G) speedups     0.9472(G)     0.8544(d)     0.9266(G)     0.9049(d) 
500 [  isgt    max   uint64] : auto speedups     0.6776(G)     0.6885(G)     0.6795(G)     0.7282(G) speedups     0.9180(G)     0.8044(d)     0.9016(G)     0.8261(d) 
501 [  isgt    max   single] : auto speedups     0.9342(G)     0.9512(G)     0.9445(G)     0.9803(G) speedups     0.9565(G)     0.9794(d)     0.9343(G)     1.0080(d) 
502 [  isgt    max   double] : auto speedups     0.8659(G)     0.9102(G)     0.8992(G)     0.9670(G) speedups     0.8640(G)     2.5214(d)     0.8426(G)     2.6214(d) 
503 [  isgt   plus  logical] : auto speedups     1.1297(G)     1.1703(G)     1.1372(G)     1.2317(G) speedups     0.9018(G)     0.9830(d)     0.8828(G)     1.0574(d) 
504 [  isgt   plus     int8] : auto speedups     1.1216(G)     1.1629(G)     1.1029(G)     1.2016(G) speedups     0.8569(G)     0.8510(d)     0.8284(G)     0.8946(d) 
505 [  isgt   plus    uint8] : auto speedups     1.1539(G)     1.1185(G)     1.1466(G)     1.2286(G) speedups     0.8600(G)     0.8695(d)     0.8267(G)     0.8950(d) 
506 [  isgt   plus    int16] : auto speedups     1.1088(G)     1.1223(G)     1.1104(G)     1.2052(G) speedups     0.8856(G)     0.8450(d)     0.8714(G)     0.9054(d) 
507 [  isgt   plus   uint16] : auto speedups     1.0497(G)     1.0896(G)     1.0695(G)     1.1260(G) speedups     0.9636(G)     0.8590(d)     0.9261(G)     0.8973(d) 
508 [  isgt   plus    int32] : auto speedups     1.0822(G)     1.1329(G)     1.1317(G)     1.1938(G) speedups     0.8543(G)     0.8503(d)     0.8473(G)     0.8978(d) 
509 [  isgt   plus   uint32] : auto speedups     1.0604(G)     1.0564(G)     1.0646(G)     1.1363(G) speedups     0.9535(G)     0.8249(d)     0.9442(G)     0.8918(d) 
510 [  isgt   plus    int64] : auto speedups     1.0257(G)     1.0542(G)     1.0337(G)     1.1065(G) speedups     0.8249(G)     0.8697(d)     0.8114(G)     0.9196(d) 
511 [  isgt   plus   uint64] : auto speedups     0.9659(G)     0.9573(G)     0.9594(G)     1.0167(G) speedups     0.8963(G)     0.8011(d)     0.8760(G)     0.8539(d) 
512 [  isgt   plus   single] : auto speedups     1.1659(G)     1.1898(G)     1.1418(G)     1.2632(G) speedups     0.9095(G)     0.9755(d)     0.8909(G)     1.0624(d) 
513 [  isgt   plus   double] : auto speedups     1.1080(G)     1.1126(G)     1.0975(G)     1.1697(G) speedups     0.8302(G)     2.4418(d)     0.7998(G)     2.5144(d) 
514 [  isgt  times  logical] : auto speedups     1.0540(G)     1.0744(G)     1.0518(G)     1.1146(G) speedups     0.9598(G)     1.0253(d)     0.9247(G)     1.0574(d) 
515 [  isgt  times     int8] : auto speedups     1.2060(G)     1.1912(G)     1.1972(G)     1.2469(G) speedups     0.8533(G)     0.8747(d)     0.8308(G)     0.8895(d) 
516 [  isgt  times    uint8] : auto speedups     1.1973(G)     1.2069(G)     1.2064(G)     1.2864(G) speedups     1.0274(G)     0.8690(d)     0.9932(G)     0.8960(d) 
517 [  isgt  times    int16] : auto speedups     1.2198(G)     1.2291(G)     1.2448(G)     1.3271(G) speedups     0.9628(G)     0.8505(d)     0.9439(G)     0.8997(d) 
518 [  isgt  times   uint16] : auto speedups     1.1990(G)     1.2093(G)     1.1989(G)     1.2935(G) speedups     0.9573(G)     0.8683(d)     0.9475(G)     0.9036(d) 
519 [  isgt  times    int32] : auto speedups     1.2293(G)     1.2002(G)     1.2202(G)     1.3165(G) speedups     0.8354(G)     0.8413(d)     0.8230(G)     0.8954(d) 
520 [  isgt  times   uint32] : auto speedups     1.2306(G)     1.2144(G)     1.2203(G)     1.2750(G) speedups     0.8184(G)     0.8592(d)     0.8015(G)     0.8939(d) 
521 [  isgt  times    int64] : auto speedups     1.1226(G)     1.0868(G)     1.0997(G)     1.1482(G) speedups     0.8109(G)     0.8547(d)     0.7960(G)     0.9171(d) 
522 [  isgt  times   uint64] : auto speedups     1.0840(G)     1.1108(G)     1.0779(G)     1.1418(G) speedups     0.8177(G)     0.7970(d)     0.8034(G)     0.8124(d) 
523 [  isgt  times   single] : auto speedups     1.1491(G)     1.1707(G)     1.1654(G)     1.2170(G) speedups     0.9630(G)     1.0088(d)     0.9391(G)     1.0505(d) 
524 [  isgt  times   double] : auto speedups     1.1059(G)     1.0636(G)     1.0994(G)     1.1539(G) speedups     0.8854(G)     2.5454(d)     0.8622(G)     2.6209(d) 
525 [  isgt     or  logical] : auto speedups     1.1508(G)     1.1544(G)     1.1638(G)     1.2217(G) speedups     0.8896(G)     1.0000(d)     0.8797(G)     1.0471(d) 
526 [  isgt    and  logical] : auto speedups     1.0642(G)     1.0530(G)     1.0240(G)     1.1248(G) speedups     0.9430(G)     1.0190(d)     0.9181(G)     1.0654(d) 
527 [  isgt    xor  logical] : auto speedups     1.0773(G)     1.0417(G)     1.0193(G)     1.0482(G) speedups     0.9036(G)     0.9196(d)     0.9299(G)     1.0461(d) 
528 [  isgt     eq  logical] : auto speedups     1.0453(G)     1.0673(G)     1.0435(G)     1.1081(G) speedups     0.9784(G)     0.9726(d)     0.9431(G)     1.0490(d) 
529 [  islt    min  logical] : auto speedups     0.9946(G)     1.0506(G)     1.0573(G)     1.1036(G) speedups     0.9045(G)     1.0085(d)     0.8734(G)     1.0822(d) 
530 [  islt    min     int8] : auto speedups     1.0540(G)     1.0873(G)     1.0767(G)     1.1360(G) speedups     0.8538(G)     0.8415(d)     0.8185(G)     0.8903(d) 
531 [  islt    min    uint8] : auto speedups     1.0884(G)     1.1178(G)     1.0787(G)     1.1795(G) speedups     0.8620(G)     0.8314(d)     0.8432(G)     0.8843(d) 
532 [  islt    min    int16] : auto speedups     1.0262(G)     1.0234(G)     1.0372(G)     1.0973(G) speedups     0.8950(G)     0.8535(d)     0.8844(G)     0.8978(d) 
533 [  islt    min   uint16] : auto speedups     1.0487(G)     1.0495(G)     1.0359(G)     1.1135(G) speedups     0.9024(G)     0.8360(d)     0.9192(G)     0.8877(d) 
534 [  islt    min    int32] : auto speedups     1.0578(G)     1.0476(G)     1.0483(G)     0.9734(G) speedups     0.9274(G)     0.8497(d)     0.9268(G)     0.8910(d) 
535 [  islt    min   uint32] : auto speedups     0.7789(G)     0.7977(G)     0.7795(G)     0.8458(G) speedups     0.9472(G)     0.8469(d)     0.9130(G)     0.8761(d) 
536 [  islt    min    int64] : auto speedups     0.8124(G)     0.8079(G)     0.8078(G)     0.8649(G) speedups     0.8192(G)     0.8622(d)     0.8060(G)     0.8974(d) 
537 [  islt    min   uint64] : auto speedups     0.7364(G)     0.7504(G)     0.7355(G)     0.7892(G) speedups     0.9317(G)     0.7897(d)     0.9177(G)     0.8310(d) 
538 [  islt    min   single] : auto speedups     0.9530(G)     0.9624(G)     0.9578(G)     1.0130(G) speedups     0.9002(G)     0.9842(d)     0.9119(G)     1.0162(d) 
539 [  islt    min   double] : auto speedups     0.9030(G)     0.9136(G)     0.9088(G)     0.9752(G) speedups     0.9356(G)     2.4157(d)     0.9046(G)     2.3537(d) 
540 [  islt    max  logical] : auto speedups     1.1215(G)     1.1168(G)     1.1137(G)     1.1663(G) speedups     0.8949(G)     0.9305(d)     0.8282(G)     0.9447(d) 
541 [  islt    max     int8] : auto speedups     0.8869(G)     0.9341(G)     0.9158(G)     0.9950(G) speedups     0.9157(G)     0.8439(d)     0.9073(G)     0.8960(d) 
542 [  islt    max    uint8] : auto speedups     0.9954(G)     1.0294(G)     0.9945(G)     1.0437(G) speedups     0.9047(G)     0.8257(d)     0.8948(G)     0.8636(d) 
543 [  islt    max    int16] : auto speedups     1.0361(G)     1.0250(G)     1.0288(G)     1.1281(G) speedups     1.0048(G)     0.8569(d)     0.9876(G)     0.8849(d) 
544 [  islt    max   uint16] : auto speedups     0.9741(G)     1.0059(G)     0.9801(G)     1.0618(G) speedups     0.9152(G)     0.8365(d)     0.8892(G)     0.8873(d) 
545 [  islt    max    int32] : auto speedups     1.0455(G)     1.0584(G)     1.0464(G)     1.1196(G) speedups     0.8278(G)     0.8478(d)     0.8234(G)     0.8884(d) 
546 [  islt    max   uint32] : auto speedups     0.7060(G)     0.7240(G)     0.7141(G)     0.7654(G) speedups     0.9334(G)     0.8403(d)     0.9113(G)     0.8888(d) 
547 [  islt    max    int64] : auto speedups     0.7910(G)     0.8096(G)     0.8129(G)     0.8776(G) speedups     0.9446(G)     0.8446(d)     0.9135(G)     0.8882(d) 
548 [  islt    max   uint64] : auto speedups     0.6874(G)     0.6757(G)     0.6776(G)     0.7299(G) speedups     0.7900(G)     0.7793(d)     0.7874(G)     0.8361(d) 
549 [  islt    max   single] : auto speedups     0.9555(G)     0.9639(G)     0.9485(G)     1.0215(G) speedups     0.9467(G)     0.9618(d)     0.9408(G)     1.0335(d) 
550 [  islt    max   double] : auto speedups     0.9188(G)     0.9139(G)     0.9165(G)     0.9718(G) speedups     0.8314(G)     2.4890(d)     0.8100(G)     2.5515(d) 
551 [  islt   plus  logical] : auto speedups     1.1635(G)     1.1483(G)     1.1451(G)     1.2371(G) speedups     0.9229(G)     0.9956(d)     0.8917(G)     1.0477(d) 
552 [  islt   plus     int8] : auto speedups     1.1485(G)     1.1433(G)     1.1644(G)     1.2190(G) speedups     0.8671(G)     0.8629(d)     0.8469(G)     0.9028(d) 
553 [  islt   plus    uint8] : auto speedups     1.1517(G)     1.1507(G)     1.1251(G)     1.2193(G) speedups     0.8659(G)     0.8667(d)     0.8523(G)     0.9010(d) 
554 [  islt   plus    int16] : auto speedups     1.1193(G)     1.1544(G)     1.0977(G)     1.2010(G) speedups     0.9037(G)     0.8675(d)     0.8835(G)     0.9180(d) 
555 [  islt   plus   uint16] : auto speedups     1.0601(G)     1.0860(G)     1.0789(G)     1.1477(G) speedups     0.9232(G)     0.8556(d)     0.8935(G)     0.8981(d) 
556 [  islt   plus    int32] : auto speedups     1.1287(G)     1.1208(G)     1.1420(G)     1.1948(G) speedups     0.8552(G)     0.8584(d)     0.8366(G)     0.8888(d) 
557 [  islt   plus   uint32] : auto speedups     1.0362(G)     1.0747(G)     1.0391(G)     1.1314(G) speedups     0.9575(G)     0.8646(d)     0.9411(G)     0.9110(d) 
558 [  islt   plus    int64] : auto speedups     1.0554(G)     1.0702(G)     0.9967(G)     1.1051(G) speedups     0.8279(G)     0.8519(d)     0.8120(G)     0.9080(d) 
559 [  islt   plus   uint64] : auto speedups     0.9564(G)     0.9749(G)     0.9614(G)     1.0217(G) speedups     0.8412(G)     0.7926(d)     0.8190(G)     0.8388(d) 
560 [  islt   plus   single] : auto speedups     1.1744(G)     1.1976(G)     1.1675(G)     1.2142(G) speedups     0.8274(G)     0.9952(d)     0.8165(G)     1.0388(d) 
561 [  islt   plus   double] : auto speedups     1.1049(G)     1.1081(G)     1.0727(G)     1.1681(G) speedups     0.8871(G)     2.4678(d)     0.8725(G)     2.6485(d) 
562 [  islt  times  logical] : auto speedups     1.0294(G)     1.0521(G)     1.0581(G)     1.0847(G) speedups     0.8876(G)     1.0270(d)     0.8681(G)     1.0478(d) 
563 [  islt  times     int8] : auto speedups     1.2493(G)     1.2104(G)     1.2621(G)     1.3111(G) speedups     0.9282(G)     0.8269(d)     0.9184(G)     0.8547(d) 
564 [  islt  times    uint8] : auto speedups     1.1400(G)     1.1813(G)     1.1867(G)     1.2596(G) speedups     0.9703(G)     0.8173(d)     0.9408(G)     0.8717(d) 
565 [  islt  times    int16] : auto speedups     1.0737(G)     1.1583(G)     1.1481(G)     1.2019(G) speedups     0.8826(G)     0.8500(d)     0.8509(G)     0.8783(d) 
566 [  islt  times   uint16] : auto speedups     1.2895(G)     1.2756(G)     1.2377(G)     1.3566(G) speedups     0.9499(G)     0.8505(d)     0.9461(G)     0.8936(d) 
567 [  islt  times    int32] : auto speedups     1.2117(G)     1.2352(G)     1.2117(G)     1.2900(G) speedups     0.8149(G)     0.8515(d)     0.8125(G)     0.8891(d) 
568 [  islt  times   uint32] : auto speedups     1.2443(G)     1.2350(G)     1.2491(G)     1.3218(G) speedups     0.8313(G)     0.8523(d)     0.7919(G)     0.9021(d) 
569 [  islt  times    int64] : auto speedups     1.1084(G)     1.1024(G)     1.0910(G)     1.1761(G) speedups     0.8076(G)     0.8665(d)     0.7973(G)     0.9096(d) 
570 [  islt  times   uint64] : auto speedups     1.1238(G)     1.1268(G)     1.1273(G)     1.1892(G) speedups     0.8098(G)     0.7996(d)     0.8040(G)     0.8329(d) 
571 [  islt  times   single] : auto speedups     1.1562(G)     1.1804(G)     1.1555(G)     1.2317(G) speedups     0.9112(G)     0.9811(d)     0.8969(G)     1.0522(d) 
572 [  islt  times   double] : auto speedups     1.0904(G)     1.1161(G)     1.0883(G)     1.1688(G) speedups     0.8795(G)     2.5222(d)     0.8706(G)     2.5693(d) 
573 [  islt     or  logical] : auto speedups     1.1219(G)     1.1682(G)     1.1377(G)     1.2349(G) speedups     0.9090(G)     0.9866(d)     0.8931(G)     1.0470(d) 
574 [  islt    and  logical] : auto speedups     1.0569(G)     1.0744(G)     1.0481(G)     1.1175(G) speedups     0.8984(G)     1.0089(d)     0.8758(G)     1.0743(d) 
575 [  islt    xor  logical] : auto speedups     1.0682(G)     1.0995(G)     1.0641(G)     1.1578(G) speedups     0.8337(G)     0.9955(d)     0.8188(G)     1.0602(d) 
576 [  islt     eq  logical] : auto speedups     1.0309(G)     1.0725(G)     1.0491(G)     1.1291(G) speedups     0.8391(G)     0.9838(d)     0.8085(G)     1.0612(d) 
577 [  isge    min  logical] : auto speedups     1.0576(G)     1.0977(G)     1.0694(G)     1.1378(G) speedups     0.8356(G)     1.0063(d)     0.8126(G)     1.0386(d) 
578 [  isge    min     int8] : auto speedups     0.9302(G)     0.9533(G)     0.9343(G)     0.9928(G) speedups     0.9342(G)     0.8336(d)     0.9219(G)     0.8967(d) 
579 [  isge    min    uint8] : auto speedups     0.9870(G)     1.0060(G)     0.9965(G)     1.0538(G) speedups     0.9067(G)     0.8122(d)     0.8844(G)     0.8756(d) 
580 [  isge    min    int16] : auto speedups     0.9988(G)     1.0544(G)     1.0325(G)     1.1069(G) speedups     0.8044(G)     0.8456(d)     0.8010(G)     0.8983(d) 
581 [  isge    min   uint16] : auto speedups     1.0395(G)     1.0382(G)     1.0454(G)     1.1077(G) speedups     0.8212(G)     0.8531(d)     0.7965(G)     0.9123(d) 
582 [  isge    min    int32] : auto speedups     1.0543(G)     1.0506(G)     1.0475(G)     1.1123(G) speedups     0.9641(G)     0.8393(d)     0.9311(G)     0.8738(d) 
583 [  isge    min   uint32] : auto speedups     1.0611(G)     1.0462(G)     1.0410(G)     1.1209(G) speedups     0.9408(G)     0.8416(d)     0.9323(G)     0.8997(d) 
584 [  isge    min    int64] : auto speedups     0.8097(G)     0.8258(G)     0.8034(G)     0.8703(G) speedups     0.9388(G)     0.8405(d)     0.9298(G)     0.9017(d) 
585 [  isge    min   uint64] : auto speedups     0.8015(G)     0.8131(G)     0.8081(G)     0.8693(G) speedups     0.9428(G)     0.8025(d)     0.9191(G)     0.8349(d) 
586 [  isge    min   single] : auto speedups     0.9423(G)     0.9702(G)     0.9356(G)     1.0148(G) speedups     0.9318(G)     0.9715(d)     0.9083(G)     1.0263(d) 
587 [  isge    min   double] : auto speedups     0.8988(G)     0.9133(G)     0.9007(G)     0.9648(G) speedups     0.9329(G)     2.4027(d)     0.9196(G)     2.4899(d) 
588 [  isge    max  logical] : auto speedups     1.1547(G)     1.1712(G)     1.1372(G)     1.2342(G) speedups     0.8366(G)     1.0107(d)     0.8302(G)     1.0767(d) 
589 [  isge    max     int8] : auto speedups     1.0189(G)     1.0749(G)     1.0814(G)     1.1381(G) speedups     0.8492(G)     0.8441(d)     0.8241(G)     0.8894(d) 
590 [  isge    max    uint8] : auto speedups     1.0956(G)     1.1130(G)     1.1142(G)     1.1671(G) speedups     0.8708(G)     0.8152(d)     0.8595(G)     0.8883(d) 
591 [  isge    max    int16] : auto speedups     1.0131(G)     1.0531(G)     1.0437(G)     1.0987(G) speedups     0.9057(G)     0.8259(d)     0.8794(G)     0.8943(d) 
592 [  isge    max   uint16] : auto speedups     0.9966(G)     1.0016(G)     0.9752(G)     1.0726(G) speedups     0.8016(G)     0.8455(d)     0.7877(G)     0.8928(d) 
593 [  isge    max    int32] : auto speedups     1.0457(G)     1.0658(G)     1.0295(G)     1.1185(G) speedups     0.9472(G)     0.8522(d)     0.9240(G)     0.8819(d) 
594 [  isge    max   uint32] : auto speedups     1.0198(G)     1.0103(G)     1.0263(G)     1.0653(G) speedups     0.9393(G)     0.8532(d)     0.9321(G)     0.8760(d) 
595 [  isge    max    int64] : auto speedups     0.8035(G)     0.8038(G)     0.8117(G)     0.8673(G) speedups     0.9387(G)     0.8436(d)     0.9343(G)     0.8870(d) 
596 [  isge    max   uint64] : auto speedups     0.7366(G)     0.7443(G)     0.7316(G)     0.7863(G) speedups     0.9394(G)     0.7891(d)     0.9092(G)     0.8369(d) 
597 [  isge    max   single] : auto speedups     0.9490(G)     0.9301(G)     0.9324(G)     1.0061(G) speedups     0.9632(G)     0.9681(d)     0.9443(G)     1.0165(d) 
598 [  isge    max   double] : auto speedups     0.9103(G)     0.8979(G)     0.9081(G)     0.9566(G) speedups     0.8466(G)     2.5808(d)     0.8421(G)     2.6532(d) 
599 [  isge   plus  logical] : auto speedups     1.1536(G)     1.1522(G)     1.1581(G)     1.2223(G) speedups     0.8383(G)     1.0137(d)     0.8171(G)     1.0570(d) 
600 [  isge   plus     int8] : auto speedups     1.1449(G)     1.1478(G)     1.1478(G)     1.2088(G) speedups     0.8642(G)     0.8454(d)     0.8458(G)     0.9046(d) 
601 [  isge   plus    uint8] : auto speedups     1.1305(G)     1.1699(G)     1.1519(G)     1.2112(G) speedups     0.8569(G)     0.8624(d)     0.8516(G)     0.9115(d) 
602 [  isge   plus    int16] : auto speedups     1.0967(G)     1.1213(G)     1.1247(G)     1.2008(G) speedups     0.8897(G)     0.8564(d)     0.8905(G)     0.8964(d) 
603 [  isge   plus   uint16] : auto speedups     1.1141(G)     1.1484(G)     1.0894(G)     1.1962(G) speedups     0.8463(G)     0.8455(d)     0.8339(G)     0.9017(d) 
604 [  isge   plus    int32] : auto speedups     1.1303(G)     1.1491(G)     1.1113(G)     1.1928(G) speedups     0.8508(G)     0.8605(d)     0.8423(G)     0.9088(d) 
605 [  isge   plus   uint32] : auto speedups     1.1125(G)     1.1353(G)     1.1290(G)     1.2151(G) speedups     0.8600(G)     0.8089(d)     0.8152(G)     0.8977(d) 
606 [  isge   plus    int64] : auto speedups     0.9255(G)     1.0637(G)     1.0419(G)     1.1125(G) speedups     0.8254(G)     0.8657(d)     0.8113(G)     0.9222(d) 
607 [  isge   plus   uint64] : auto speedups     1.0328(G)     1.0263(G)     1.0428(G)     1.0909(G) speedups     0.8854(G)     0.8063(d)     0.8755(G)     0.8595(d) 
608 [  isge   plus   single] : auto speedups     1.1724(G)     1.1732(G)     1.1706(G)     1.2446(G) speedups     0.8985(G)     1.0184(d)     0.8828(G)     1.0211(d) 
609 [  isge   plus   double] : auto speedups     1.1276(G)     1.1117(G)     1.1012(G)     1.1541(G) speedups     0.8260(G)     2.2790(d)     0.8090(G)     2.5133(d) 
610 [  isge  times  logical] : auto speedups     1.0760(G)     1.0882(G)     1.0712(G)     1.1658(G) speedups     0.8456(G)     0.9936(d)     0.8134(G)     1.0471(d) 
611 [  isge  times     int8] : auto speedups     1.1370(G)     1.1471(G)     1.1267(G)     1.2143(G) speedups     0.9374(G)     0.8597(d)     0.9210(G)     0.9099(d) 
612 [  isge  times    uint8] : auto speedups     1.1468(G)     1.1656(G)     1.1498(G)     1.2120(G) speedups     1.0109(G)     0.8610(d)     0.9849(G)     0.9039(d) 
613 [  isge  times    int16] : auto speedups     1.0664(G)     1.0870(G)     1.0716(G)     1.1621(G) speedups     0.9005(G)     0.8598(d)     0.8767(G)     0.9023(d) 
614 [  isge  times   uint16] : auto speedups     1.1909(G)     1.2197(G)     1.1966(G)     1.2644(G) speedups     0.9585(G)     0.8728(d)     0.9335(G)     0.9069(d) 
615 [  isge  times    int32] : auto speedups     1.1503(G)     1.1725(G)     1.1652(G)     1.2182(G) speedups     0.8282(G)     0.8552(d)     0.8282(G)     0.8926(d) 
616 [  isge  times   uint32] : auto speedups     1.1859(G)     1.2087(G)     1.1899(G)     1.2350(G) speedups     0.8155(G)     0.7985(d)     0.7779(G)     0.8442(d) 
617 [  isge  times    int64] : auto speedups     0.9962(G)     1.0292(G)     1.0334(G)     1.1189(G) speedups     0.8054(G)     0.8753(d)     0.8105(G)     0.9283(d) 
618 [  isge  times   uint64] : auto speedups     1.0646(G)     1.0602(G)     1.0780(G)     1.1410(G) speedups     0.8196(G)     0.8170(d)     0.7973(G)     0.8540(d) 
619 [  isge  times   single] : auto speedups     1.1687(G)     1.1717(G)     1.1529(G)     1.2334(G) speedups     0.9147(G)     0.9920(d)     0.8857(G)     1.0383(d) 
620 [  isge  times   double] : auto speedups     1.1093(G)     1.0980(G)     1.0751(G)     1.1630(G) speedups     0.9639(G)     2.3774(d)     0.9353(G)     2.5274(d) 
621 [  isge     or  logical] : auto speedups     1.1585(G)     1.1586(G)     1.1668(G)     1.2342(G) speedups     0.8475(G)     1.0211(d)     0.8288(G)     1.0657(d) 
622 [  isge    and  logical] : auto speedups     1.0843(G)     1.0849(G)     1.0833(G)     1.1676(G) speedups     0.8370(G)     0.9968(d)     0.8252(G)     1.0531(d) 
623 [  isge    xor  logical] : auto speedups     1.0744(G)     1.0956(G)     1.0865(G)     1.1317(G) speedups     0.8926(G)     0.9967(d)     0.8807(G)     1.0612(d) 
624 [  isge     eq  logical] : auto speedups     1.1037(G)     1.1152(G)     1.1117(G)     1.1816(G) speedups     0.9011(G)     1.0052(d)     0.8903(G)     1.0642(d) 
625 [  isle    min  logical] : auto speedups     1.0625(G)     1.0816(G)     1.0535(G)     1.1246(G) speedups     0.8951(G)     1.0062(d)     0.8874(G)     1.0634(d) 
626 [  isle    min     int8] : auto speedups     0.9422(G)     0.9532(G)     0.9428(G)     1.0013(G) speedups     0.9136(G)     0.8368(d)     0.8824(G)     0.8915(d) 
627 [  isle    min    uint8] : auto speedups     0.9841(G)     0.9986(G)     0.9792(G)     1.0511(G) speedups     0.9189(G)     0.8259(d)     0.9083(G)     0.8673(d) 
628 [  isle    min    int16] : auto speedups     1.0429(G)     1.0610(G)     1.0489(G)     1.0998(G) speedups     0.8137(G)     0.8569(d)     0.7979(G)     0.9012(d) 
629 [  isle    min   uint16] : auto speedups     1.0221(G)     1.0044(G)     0.9918(G)     1.0763(G) speedups     0.8039(G)     0.8243(d)     0.7879(G)     0.8829(d) 
630 [  isle    min    int32] : auto speedups     1.0488(G)     1.0487(G)     1.0451(G)     1.1170(G) speedups     0.9500(G)     0.8542(d)     0.9408(G)     0.8985(d) 
631 [  isle    min   uint32] : auto speedups     1.0395(G)     1.0392(G)     0.9857(G)     1.0984(G) speedups     0.9414(G)     0.8484(d)     0.9392(G)     0.8887(d) 
632 [  isle    min    int64] : auto speedups     0.8060(G)     0.8226(G)     0.8109(G)     0.8646(G) speedups     0.9352(G)     0.8545(d)     0.9125(G)     0.9028(d) 
633 [  isle    min   uint64] : auto speedups     0.7998(G)     0.8202(G)     0.7987(G)     0.8655(G) speedups     0.9302(G)     0.8044(d)     0.9215(G)     0.8340(d) 
634 [  isle    min   single] : auto speedups     0.9334(G)     0.9731(G)     0.9393(G)     0.9978(G) speedups     0.9770(G)     0.9793(d)     0.9423(G)     1.0220(d) 
635 [  isle    min   double] : auto speedups     0.9127(G)     0.9180(G)     0.9087(G)     0.9765(G) speedups     0.8560(G)     2.4807(d)     0.8342(G)     2.5398(d) 
636 [  isle    max  logical] : auto speedups     1.1246(G)     1.1256(G)     1.1328(G)     1.2144(G) speedups     0.9152(G)     0.9931(d)     0.8891(G)     1.0580(d) 
637 [  isle    max     int8] : auto speedups     1.0571(G)     1.0796(G)     1.0712(G)     1.1283(G) speedups     0.8486(G)     0.8504(d)     0.8264(G)     0.8922(d) 
638 [  isle    max    uint8] : auto speedups     1.0632(G)     1.0494(G)     1.0588(G)     1.1189(G) speedups     0.9183(G)     0.8281(d)     0.8962(G)     0.8732(d) 
639 [  isle    max    int16] : auto speedups     1.0304(G)     1.0390(G)     1.0381(G)     1.1146(G) speedups     0.8142(G)     0.8581(d)     0.8012(G)     0.9034(d) 
640 [  isle    max   uint16] : auto speedups     0.9608(G)     0.9741(G)     0.9710(G)     1.0325(G) speedups     0.8020(G)     0.8480(d)     0.7865(G)     0.8841(d) 
641 [  isle    max    int32] : auto speedups     1.0304(G)     1.0382(G)     1.0602(G)     1.1373(G) speedups     0.9590(G)     0.8542(d)     0.9454(G)     0.9095(d) 
642 [  isle    max   uint32] : auto speedups     0.9521(G)     0.9881(G)     0.9697(G)     1.0421(G) speedups     0.9476(G)     0.8544(d)     0.9222(G)     0.9028(d) 
643 [  isle    max    int64] : auto speedups     0.7734(G)     0.8132(G)     0.8075(G)     0.8702(G) speedups     0.9422(G)     0.8405(d)     0.8435(G)     0.8923(d) 
644 [  isle    max   uint64] : auto speedups     0.7346(G)     0.7519(G)     0.7451(G)     0.7929(G) speedups     0.8313(G)     0.7811(d)     0.8169(G)     0.8281(d) 
645 [  isle    max   single] : auto speedups     0.9533(G)     0.9557(G)     0.9566(G)     1.0142(G) speedups     0.9632(G)     0.9657(d)     0.9459(G)     1.0240(d) 
646 [  isle    max   double] : auto speedups     0.9143(G)     0.9084(G)     0.9114(G)     0.9716(G) speedups     0.8413(G)     2.4303(d)     0.8266(G)     2.5046(d) 
647 [  isle   plus  logical] : auto speedups     1.1404(G)     1.1292(G)     1.1179(G)     1.1962(G) speedups     0.9203(G)     1.0053(d)     0.8963(G)     1.0583(d) 
648 [  isle   plus     int8] : auto speedups     1.1423(G)     1.1667(G)     1.1489(G)     1.2031(G) speedups     0.8738(G)     0.8660(d)     0.8493(G)     0.9095(d) 
649 [  isle   plus    uint8] : auto speedups     1.1212(G)     1.1303(G)     1.0444(G)     1.1961(G) speedups     0.8662(G)     0.8638(d)     0.8484(G)     0.9133(d) 
650 [  isle   plus    int16] : auto speedups     1.1153(G)     1.1217(G)     1.1358(G)     1.2158(G) speedups     0.8413(G)     0.8400(d)     0.8310(G)     0.8911(d) 
651 [  isle   plus   uint16] : auto speedups     1.0593(G)     1.0855(G)     1.0723(G)     1.1353(G) speedups     0.9848(G)     0.8616(d)     0.9626(G)     0.9029(d) 
652 [  isle   plus    int32] : auto speedups     1.1320(G)     1.1377(G)     1.1312(G)     1.2008(G) speedups     1.0312(G)     0.8628(d)     0.9906(G)     0.9068(d) 
653 [  isle   plus   uint32] : auto speedups     1.0838(G)     1.0979(G)     1.0932(G)     1.1744(G) speedups     0.8557(G)     0.8647(d)     0.8453(G)     0.9053(d) 
654 [  isle   plus    int64] : auto speedups     1.0464(G)     1.0456(G)     1.0286(G)     1.1168(G) speedups     0.8985(G)     0.8683(d)     0.8710(G)     0.9085(d) 
655 [  isle   plus   uint64] : auto speedups     1.0034(G)     1.0225(G)     0.9977(G)     1.0680(G) speedups     0.8914(G)     0.7874(d)     0.8659(G)     0.8321(d) 
656 [  isle   plus   single] : auto speedups     1.1840(G)     1.1873(G)     1.1907(G)     1.2660(G) speedups     0.8960(G)     1.0026(d)     0.8839(G)     1.0604(d) 
657 [  isle   plus   double] : auto speedups     1.1076(G)     1.1121(G)     1.1087(G)     1.1752(G) speedups     0.9186(G)     2.6044(d)     0.8886(G)     2.6049(d) 
658 [  isle  times  logical] : auto speedups     1.0565(G)     1.0790(G)     1.0543(G)     1.1290(G) speedups     0.8915(G)     0.9887(d)     0.8766(G)     1.0535(d) 
659 [  isle  times     int8] : auto speedups     1.0951(G)     1.0995(G)     1.0833(G)     1.1670(G) speedups     0.9869(G)     0.8578(d)     0.9453(G)     0.9176(d) 
660 [  isle  times    uint8] : auto speedups     1.1375(G)     1.1555(G)     1.1345(G)     1.2090(G) speedups     0.9437(G)     0.8670(d)     0.9158(G)     0.9087(d) 
661 [  isle  times    int16] : auto speedups     1.1446(G)     1.1645(G)     1.1427(G)     1.2115(G) speedups     0.9617(G)     0.8662(d)     0.9395(G)     0.9059(d) 
662 [  isle  times   uint16] : auto speedups     1.0774(G)     1.0743(G)     1.0279(G)     1.0938(G) speedups     0.8561(G)     0.8281(d)     0.8067(G)     0.8243(d) 
663 [  isle  times    int32] : auto speedups     1.0438(G)     1.1339(G)     1.1700(G)     1.2278(G) speedups     0.8246(G)     0.8644(d)     0.8173(G)     0.9019(d) 
664 [  isle  times   uint32] : auto speedups     1.1238(G)     1.1609(G)     1.1258(G)     1.2269(G) speedups     0.8205(G)     0.8656(d)     0.8110(G)     0.8955(d) 
665 [  isle  times    int64] : auto speedups     1.0612(G)     1.0776(G)     1.0459(G)     1.1228(G) speedups     0.8121(G)     0.8768(d)     0.8001(G)     0.9036(d) 
666 [  isle  times   uint64] : auto speedups     1.0325(G)     1.0438(G)     1.0407(G)     1.1129(G) speedups     0.8247(G)     0.7864(d)     0.8037(G)     0.8362(d) 
667 [  isle  times   single] : auto speedups     1.1641(G)     1.1761(G)     1.1723(G)     1.2549(G) speedups     0.8951(G)     1.0022(d)     0.8759(G)     1.0585(d) 
668 [  isle  times   double] : auto speedups     1.1085(G)     1.0910(G)     1.0932(G)     1.1760(G) speedups     0.9125(G)     2.5242(d)     0.8891(G)     2.5788(d) 
669 [  isle     or  logical] : auto speedups     1.1108(G)     1.1499(G)     1.1251(G)     1.2020(G) speedups     0.9037(G)     0.9866(d)     0.8839(G)     1.0616(d) 
670 [  isle    and  logical] : auto speedups     1.0265(G)     1.0507(G)     1.0596(G)     1.1195(G) speedups     0.8961(G)     0.9936(d)     0.8707(G)     1.0625(d) 
671 [  isle    xor  logical] : auto speedups     1.0503(G)     1.0620(G)     1.0487(G)     1.1322(G) speedups     0.8957(G)     0.9998(d)     0.8555(G)     1.0586(d) 
672 [  isle     eq  logical] : auto speedups     1.0710(G)     1.0741(G)     1.0769(G)     1.1282(G) speedups     0.8914(G)     0.9969(d)     0.8595(G)     1.0387(d) 
673 [    or    min  logical] : auto speedups     1.0521(G)     1.0692(G)     1.0502(G)     1.1228(G) speedups     0.8534(G)     0.9888(d)     0.8309(G)     1.0282(d) 
674 [    or    min     int8] : auto speedups     1.0377(G)     1.0533(G)     1.0526(G)     1.1087(G) speedups     0.8805(G)     0.8163(d)     0.8727(G)     0.8604(d) 
675 [    or    min    uint8] : auto speedups     1.0414(G)     1.0528(G)     1.0385(G)     1.1158(G) speedups     0.8789(G)     0.8044(d)     0.8544(G)     0.8549(d) 
676 [    or    min    int16] : auto speedups     0.8675(G)     0.8808(G)     0.8630(G)     0.9234(G) speedups     0.8958(G)     0.8281(d)     0.8718(G)     0.8701(d) 
677 [    or    min   uint16] : auto speedups     0.8647(G)     0.8806(G)     0.8610(G)     0.9231(G) speedups     0.8826(G)     0.8271(d)     0.8698(G)     0.8716(d) 
678 [    or    min    int32] : auto speedups     1.0495(G)     1.0587(G)     1.0497(G)     1.1235(G) speedups     0.9003(G)     0.8180(d)     0.8855(G)     0.8728(d) 
679 [    or    min   uint32] : auto speedups     1.0378(G)     1.0485(G)     1.0439(G)     1.1266(G) speedups     0.9113(G)     0.8035(d)     0.8921(G)     0.8385(d) 
680 [    or    min    int64] : auto speedups     0.9626(G)     0.9267(G)     0.8564(G)     0.9670(G) speedups     0.8894(G)     0.7879(d)     0.9102(G)     0.8470(d) 
681 [    or    min   uint64] : auto speedups     0.8802(G)     0.9577(G)     0.9494(G)     1.0098(G) speedups     0.9355(G)     0.7517(d)     0.9024(G)     0.7683(d) 
682 [    or    min   single] : auto speedups     0.7368(G)     0.7620(G)     0.7509(G)     0.7999(G) speedups     0.8335(G)     0.8361(d)     0.8147(G)     0.9839(d) 
683 [    or    min   double] : auto speedups     0.6956(G)     0.6745(G)     0.6716(G)     0.7244(G) speedups     0.8228(G)     2.1752(d)     0.8298(G)     2.0206(d) 
684 [    or    max  logical] : auto speedups     1.0973(G)     1.0987(G)     1.1192(G)     1.1673(G) speedups     0.8471(G)     0.9557(d)     0.7863(G)     1.0302(d) 
685 [    or    max     int8] : auto speedups     0.9392(G)     0.9467(G)     0.9358(G)     0.9818(G) speedups     0.9268(G)     0.7878(d)     0.8434(G)     0.7577(d) 
686 [    or    max    uint8] : auto speedups     0.8820(G)     0.9077(G)     0.9337(G)     0.9842(G) speedups     0.8887(G)     0.7793(d)     0.6868(G)     0.4177(d) 
687 [    or    max    int16] : auto speedups     0.7562(G)     0.8700(G)     0.8631(G)     0.5799(G) speedups     0.7162(G)     0.7737(d)     0.7291(G)     0.5585(d) 
688 [    or    max   uint16] : auto speedups     0.5977(G)     0.3895(G)     0.5432(G)     0.6521(G) speedups     0.6409(G)     0.5120(d)     0.4577(G)     0.6000(d) 
689 [    or    max    int32] : auto speedups     0.5135(G)     0.6478(G)     0.7408(G)     1.1082(G) speedups     0.9802(G)     0.8290(d)     0.9562(G)     0.8725(d) 
690 [    or    max   uint32] : auto speedups     1.0097(G)     1.0249(G)     1.0156(G)     1.0751(G) speedups     0.9437(G)     0.8074(d)     0.9370(G)     0.8410(d) 
691 [    or    max    int64] : auto speedups     0.9592(G)     0.9583(G)     0.9496(G)     1.0272(G) speedups     0.8045(G)     0.8406(d)     0.7839(G)     0.8703(d) 
692 [    or    max   uint64] : auto speedups     0.9237(G)     0.9395(G)     0.9052(G)     0.9626(G) speedups     0.8968(G)     0.7518(d)     0.8737(G)     0.7705(d) 
693 [    or    max   single] : auto speedups     0.7650(G)     0.7768(G)     0.7742(G)     0.8239(G) speedups     0.8541(G)     0.9667(d)     0.8380(G)     1.0083(d) 
694 [    or    max   double] : auto speedups     0.7214(G)     0.7408(G)     0.7452(G)     0.7835(G) speedups     0.9573(G)     2.1891(d)     0.9354(G)     2.2675(d) 
695 [    or   plus  logical] : auto speedups     1.1670(G)     1.2163(G)     1.1682(G)     1.2745(G) speedups     0.8860(G)     0.9941(d)     0.8667(G)     1.0538(d) 
696 [    or   plus     int8] : auto speedups     1.0549(G)     1.1008(G)     1.0670(G)     1.1653(G) speedups     0.8548(G)     0.8366(d)     0.8488(G)     0.8829(d) 
697 [    or   plus    uint8] : auto speedups     1.0827(G)     1.0932(G)     1.0983(G)     1.1444(G) speedups     0.8544(G)     0.8420(d)     0.8521(G)     0.8814(d) 
698 [    or   plus    int16] : auto speedups     1.0590(G)     1.0794(G)     1.0489(G)     1.1343(G) speedups     0.7886(G)     0.8343(d)     0.9186(G)     0.8222(d) 
699 [    or   plus   uint16] : auto speedups     0.7578(G)     1.0479(G)     1.0324(G)     1.1266(G) speedups     0.9106(G)     0.8374(d)     0.8865(G)     0.8068(d) 
700 [    or   plus    int32] : auto speedups     1.1243(G)     1.1751(G)     1.1169(G)     0.9437(G) speedups     0.6136(G)     0.7632(d)     0.8056(G)     0.8622(d) 
701 [    or   plus   uint32] : auto speedups     1.1126(G)     1.1383(G)     1.1404(G)     1.1810(G) speedups     0.9645(G)     0.7927(d)     0.6982(G)     0.7203(d) 
702 [    or   plus    int64] : auto speedups     1.0126(G)     1.0269(G)     1.0159(G)     1.0021(G) speedups     0.9271(G)     0.7929(d)     0.9191(G)     0.7974(d) 
703 [    or   plus   uint64] : auto speedups     0.9358(G)     0.9331(G)     0.9200(G)     1.0197(G) speedups     0.8223(G)     0.7154(d)     0.8031(G)     0.7615(d) 
704 [    or   plus   single] : auto speedups     0.8395(G)     0.8532(G)     0.8206(G)     0.8895(G) speedups     0.8947(G)     0.9694(d)     0.9545(G)     1.0050(d) 
705 [    or   plus   double] : auto speedups     0.7728(G)     0.8198(G)     0.7914(G)     0.8306(G) speedups     0.8256(G)     1.9622(d)     0.8082(G)     2.1100(d) 
706 [    or  times  logical] : auto speedups     0.9658(G)     0.9838(G)     1.0094(G)     1.0518(G) speedups     0.8020(G)     0.9169(d)     0.7870(G)     0.9227(d) 
707 [    or  times     int8] : auto speedups     1.1296(G)     1.1882(G)     1.1648(G)     1.2579(G) speedups     0.8792(G)     0.7357(d)     0.8635(G)     0.8370(d) 
708 [    or  times    uint8] : auto speedups     1.1080(G)     1.1509(G)     1.1618(G)     1.2121(G) speedups     0.9016(G)     0.7746(d)     0.9294(G)     0.8051(d) 
709 [    or  times    int16] : auto speedups     0.8798(G)     0.8955(G)     0.8766(G)     0.9400(G) speedups     0.8700(G)     0.7593(d)     0.8784(G)     0.7769(d) 
710 [    or  times   uint16] : auto speedups     0.9412(G)     0.9668(G)     0.9608(G)     1.0065(G) speedups     0.9300(G)     0.8392(d)     0.9248(G)     0.8631(d) 
711 [    or  times    int32] : auto speedups     1.1263(G)     1.1303(G)     1.1116(G)     1.1911(G) speedups     0.8785(G)     0.8095(d)     0.8828(G)     0.8693(d) 
712 [    or  times   uint32] : auto speedups     1.0019(G)     1.1045(G)     1.0784(G)     1.1302(G) speedups     0.8582(G)     0.7481(d)     0.8357(G)     0.7870(d) 
713 [    or  times    int64] : auto speedups     0.9339(G)     0.9992(G)     0.9366(G)     1.0324(G) speedups     0.7227(G)     0.8066(d)     0.7717(G)     0.8188(d) 
714 [    or  times   uint64] : auto speedups     0.9499(G)     0.9786(G)     0.9790(G)     1.0040(G) speedups     0.7945(G)     0.7170(d)     0.7911(G)     0.7539(d) 
715 [    or  times   single] : auto speedups     0.8379(G)     0.8545(G)     0.8323(G)     0.9194(G) speedups     0.7870(G)     0.9772(d)     0.7751(G)     1.0111(d) 
716 [    or  times   double] : auto speedups     0.8110(G)     0.7913(G)     0.7794(G)     0.8724(G) speedups     0.8438(G)     2.0161(d)     0.8254(G)     2.0991(d) 
717 [    or     or  logical] : auto speedups     1.0351(G)     1.1347(G)     1.1148(G)     1.2046(G) speedups     0.8427(G)     0.9326(d)     0.8457(G)     0.9973(d) 
718 [    or    and  logical] : auto speedups     1.0198(G)     0.9440(G)     1.0034(G)     1.0928(G) speedups     0.8624(G)     0.9079(d)     0.8335(G)     1.0228(d) 
719 [    or    xor  logical] : auto speedups     1.1044(G)     1.1331(G)     1.1321(G)     1.2072(G) speedups     0.8445(G)     0.9632(d)     0.8399(G)     1.0260(d) 
720 [    or     eq  logical] : auto speedups     0.9418(G)     0.9439(G)     0.9521(G)     1.0270(G) speedups     0.9160(G)     0.9379(d)     0.9170(G)     1.0055(d) 
721 [   and    min  logical] : auto speedups     1.0221(G)     1.0225(G)     0.9852(G)     1.0614(G) speedups     0.9587(G)     0.9403(d)     0.9566(G)     1.0129(d) 
722 [   and    min     int8] : auto speedups     0.9928(G)     1.0183(G)     1.0142(G)     1.0861(G) speedups     0.9732(G)     0.7601(d)     0.9621(G)     0.8643(d) 
723 [   and    min    uint8] : auto speedups     1.0225(G)     1.0270(G)     1.0307(G)     1.1027(G) speedups     0.8460(G)     0.8384(d)     0.7829(G)     0.8419(d) 
724 [   and    min    int16] : auto speedups     0.8663(G)     0.8933(G)     0.8977(G)     0.9518(G) speedups     0.8688(G)     0.8142(d)     0.8375(G)     0.7792(d) 
725 [   and    min   uint16] : auto speedups     0.8466(G)     0.8808(G)     0.8622(G)     0.9320(G) speedups     0.8956(G)     0.7568(d)     0.8757(G)     0.8249(d) 
726 [   and    min    int32] : auto speedups     0.7547(G)     0.7468(G)     0.7810(G)     0.8138(G) speedups     0.8790(G)     0.7912(d)     0.9048(G)     0.8436(d) 
727 [   and    min   uint32] : auto speedups     0.8012(G)     0.7768(G)     0.7737(G)     0.8384(G) speedups     0.9085(G)     0.7553(d)     0.8768(G)     0.8421(d) 
728 [   and    min    int64] : auto speedups     0.8724(G)     0.8860(G)     0.8280(G)     0.9035(G) speedups     0.7473(G)     0.7621(d)     0.7254(G)     0.8370(d) 
729 [   and    min   uint64] : auto speedups     0.8913(G)     0.8940(G)     0.8697(G)     0.9507(G) speedups     0.9621(G)     0.7716(d)     0.9448(G)     0.8024(d) 
730 [   and    min   single] : auto speedups     0.7312(G)     0.7357(G)     0.7469(G)     0.7608(G) speedups     0.8352(G)     0.9463(d)     0.8421(G)     0.9696(d) 
731 [   and    min   double] : auto speedups     0.7148(G)     0.7391(G)     0.6921(G)     0.7166(G) speedups     0.8490(G)     1.9633(d)     0.8336(G)     1.9690(d) 
732 [   and    max  logical] : auto speedups     1.0176(G)     1.0309(G)     0.9959(G)     0.9820(G) speedups     0.9206(G)     0.7225(d)     0.9444(G)     1.0280(d) 
733 [   and    max     int8] : auto speedups     0.9808(G)     0.9232(G)     0.9796(G)     1.0307(G) speedups     0.8986(G)     0.7897(d)     0.8668(G)     0.8222(d) 
734 [   and    max    uint8] : auto speedups     0.9923(G)     1.0049(G)     0.9772(G)     1.0180(G) speedups     0.9946(G)     0.7821(d)     0.9678(G)     0.8521(d) 
735 [   and    max    int16] : auto speedups     0.8638(G)     0.9030(G)     0.8669(G)     0.9331(G) speedups     0.9575(G)     0.7907(d)     0.8981(G)     0.8117(d) 
736 [   and    max   uint16] : auto speedups     0.8534(G)     0.8725(G)     0.8175(G)     0.9198(G) speedups     0.9428(G)     0.8009(d)     0.8928(G)     0.8346(d) 
737 [   and    max    int32] : auto speedups     0.7593(G)     0.7941(G)     0.8055(G)     0.8709(G) speedups     0.9662(G)     0.8461(d)     0.9283(G)     0.8653(d) 
738 [   and    max   uint32] : auto speedups     0.7257(G)     0.7507(G)     0.7366(G)     0.7740(G) speedups     0.8858(G)     0.7950(d)     0.8800(G)     0.8583(d) 
739 [   and    max    int64] : auto speedups     0.8867(G)     0.8712(G)     0.8692(G)     0.9596(G) speedups     0.8881(G)     0.7591(d)     0.9093(G)     0.8138(d) 
740 [   and    max   uint64] : auto speedups     0.8107(G)     0.8163(G)     0.8474(G)     0.8496(G) speedups     0.8664(G)     0.6138(d)     0.8112(G)     0.7880(d) 
741 [   and    max   single] : auto speedups     0.7100(G)     0.7033(G)     0.7261(G)     0.7795(G) speedups     0.8137(G)     0.8729(d)     0.8231(G)     0.9030(d) 
742 [   and    max   double] : auto speedups     0.6911(G)     0.6487(G)     0.6486(G)     0.7839(G) speedups     0.8684(G)     1.7360(d)     0.7783(G)     1.6958(d) 
743 [   and   plus  logical] : auto speedups     0.8164(G)     0.7166(G)     0.7279(G)     0.7768(G) speedups     0.8769(G)     0.7405(d)     0.6111(G)     0.9132(d) 
744 [   and   plus     int8] : auto speedups     0.6611(G)     1.0149(G)     0.9674(G)     1.0149(G) speedups     0.8563(G)     0.7698(d)     0.7444(G)     0.7733(d) 
745 [   and   plus    uint8] : auto speedups     0.9664(G)     1.0575(G)     1.0732(G)     1.1332(G) speedups     0.8182(G)     0.7947(d)     0.7625(G)     0.7874(d) 
746 [   and   plus    int16] : auto speedups     1.0567(G)     1.0408(G)     1.0467(G)     1.1322(G) speedups     0.9982(G)     0.8492(d)     0.9464(G)     0.7935(d) 
747 [   and   plus   uint16] : auto speedups     0.9878(G)     1.0177(G)     1.0338(G)     1.1207(G) speedups     0.8572(G)     0.7774(d)     0.8579(G)     0.7360(d) 
748 [   and   plus    int32] : auto speedups     0.8810(G)     0.9768(G)     0.9311(G)     1.0830(G) speedups     0.6941(G)     0.6727(d)     0.7075(G)     0.7107(d) 
749 [   and   plus   uint32] : auto speedups     0.8707(G)     1.0284(G)     1.0342(G)     1.1284(G) speedups     0.8801(G)     0.7995(d)     0.8599(G)     0.8497(d) 
750 [   and   plus    int64] : auto speedups     0.9547(G)     0.9818(G)     0.9071(G)     0.7042(G) speedups     0.7051(G)     0.7815(d)     0.9496(G)     0.8422(d) 
751 [   and   plus   uint64] : auto speedups     0.9346(G)     0.8554(G)     0.9571(G)     1.0239(G) speedups     0.8047(G)     0.7733(d)     0.8124(G)     0.8454(d) 
752 [   and   plus   single] : auto speedups     0.9061(G)     0.9102(G)     0.8899(G)     0.9568(G) speedups     0.8001(G)     0.9744(d)     0.8008(G)     1.0477(d) 
753 [   and   plus   double] : auto speedups     0.8534(G)     0.8360(G)     0.8554(G)     0.9101(G) speedups     0.7975(G)     2.0945(d)     0.7947(G)     2.1667(d) 
754 [   and  times  logical] : auto speedups     1.0352(G)     1.0383(G)     1.0343(G)     1.1069(G) speedups     0.9920(G)     0.9882(d)     0.9637(G)     1.0457(d) 
755 [   and  times     int8] : auto speedups     1.3878(G)     1.3765(G)     1.3950(G)     1.4897(G) speedups     1.0575(G)     0.8493(d)     1.0266(G)     0.8911(d) 
756 [   and  times    uint8] : auto speedups     1.4070(G)     1.4005(G)     1.3873(G)     1.4898(G) speedups     1.0579(G)     0.8049(d)     1.0320(G)     0.8880(d) 
757 [   and  times    int16] : auto speedups     1.3935(G)     1.3932(G)     1.3406(G)     1.4646(G) speedups     1.0591(G)     0.8345(d)     1.0490(G)     0.8806(d) 
758 [   and  times   uint16] : auto speedups     1.3456(G)     1.3921(G)     1.3708(G)     1.4611(G) speedups     1.0477(G)     0.8119(d)     0.9821(G)     0.8613(d) 
759 [   and  times    int32] : auto speedups     1.3872(G)     1.3601(G)     1.3540(G)     1.4196(G) speedups     1.0384(G)     0.7457(d)     0.9972(G)     0.8287(d) 
760 [   and  times   uint32] : auto speedups     1.2686(G)     1.3384(G)     1.3231(G)     1.4143(G) speedups     0.7470(G)     0.4504(d)     0.4982(G)     0.5081(d) 
761 [   and  times    int64] : auto speedups     0.4049(G)     0.3254(G)     0.4099(G)     0.4364(G) speedups     0.5061(G)     0.4393(d)     0.4687(G)     0.4229(d) 
762 [   and  times   uint64] : auto speedups     0.4838(G)     0.4464(G)     0.4861(G)     0.5260(G) speedups     0.5046(G)     0.4194(d)     0.4692(G)     0.4283(d) 
763 [   and  times   single] : auto speedups     0.3369(G)     0.2950(G)     0.3096(G)     0.3704(G) speedups     0.4063(G)     0.4688(d)     0.4110(G)     0.4506(d) 
764 [   and  times   double] : auto speedups     0.2987(G)     0.2863(G)     0.3040(G)     0.3223(G) speedups     0.3833(G)     0.7474(d)     0.3829(G)     0.7777(d) 
765 [   and     or  logical] : auto speedups     0.3813(G)     0.5081(G)     0.8955(G)     0.9176(G) speedups     0.9024(G)     0.9191(d)     0.9590(G)     0.9866(d) 
766 [   and    and  logical] : auto speedups     0.9091(G)     0.9943(G)     0.9912(G)     1.0290(G) speedups     0.9165(G)     0.8402(d)     0.8244(G)     0.8337(d) 
767 [   and    xor  logical] : auto speedups     0.8900(G)     1.0046(G)     1.0442(G)     1.0360(G) speedups     0.9604(G)     0.9650(d)     0.9532(G)     1.0228(d) 
768 [   and     eq  logical] : auto speedups     1.0987(G)     0.9775(G)     0.9355(G)     1.1577(G) speedups     1.0712(G)     0.9817(d)     1.0562(G)     0.9953(d) 
769 [   xor    min  logical] : auto speedups     0.9557(G)     1.0768(G)     1.0265(G)     0.8728(G) speedups     0.8874(G)     0.9243(d)     0.8934(G)     1.0518(d) 
770 [   xor    min     int8] : auto speedups     1.0299(G)     1.0278(G)     1.0060(G)     1.0995(G) speedups     0.8924(G)     0.7949(d)     0.8498(G)     0.8540(d) 
771 [   xor    min    uint8] : auto speedups     0.9744(G)     0.9312(G)     0.8931(G)     0.9896(G) speedups     0.9135(G)     0.7996(d)     0.9006(G)     0.8498(d) 
772 [   xor    min    int16] : auto speedups     0.9163(G)     0.9501(G)     0.9379(G)     0.9983(G) speedups     1.0761(G)     0.8318(d)     1.0132(G)     0.8831(d) 
773 [   xor    min   uint16] : auto speedups     0.9295(G)     0.9402(G)     0.9267(G)     0.9478(G) speedups     0.9660(G)     0.8401(d)     0.9343(G)     0.8989(d) 
774 [   xor    min    int32] : auto speedups     0.6853(G)     0.7789(G)     0.7943(G)     0.8639(G) speedups     0.9561(G)     0.8402(d)     0.9265(G)     0.8734(d) 
775 [   xor    min   uint32] : auto speedups     0.8246(G)     0.8072(G)     0.8214(G)     0.8754(G) speedups     0.9551(G)     0.8376(d)     0.9120(G)     0.8863(d) 
776 [   xor    min    int64] : auto speedups     0.8697(G)     0.8335(G)     0.8170(G)     0.8605(G) speedups     0.8419(G)     0.8230(d)     0.8672(G)     0.8826(d) 
777 [   xor    min   uint64] : auto speedups     0.9075(G)     0.9085(G)     0.8951(G)     0.9816(G) speedups     0.9306(G)     0.7742(d)     0.9168(G)     0.8129(d) 
778 [   xor    min   single] : auto speedups     0.7598(G)     0.7719(G)     0.7693(G)     0.8214(G) speedups     0.9216(G)     0.9636(d)     0.7445(G)     0.7598(d) 
779 [   xor    min   double] : auto speedups     0.5389(G)     0.4792(G)     0.6375(G)     0.7382(G) speedups     0.8487(G)     1.9164(d)     0.8156(G)     2.0373(d) 
780 [   xor    max  logical] : auto speedups     1.0671(G)     1.1282(G)     1.1303(G)     1.2132(G) speedups     0.8511(G)     0.9635(d)     0.8333(G)     0.9954(d) 
781 [   xor    max     int8] : auto speedups     0.9214(G)     0.9235(G)     0.9158(G)     0.9926(G) speedups     0.8797(G)     0.7955(d)     0.8693(G)     0.8586(d) 
782 [   xor    max    uint8] : auto speedups     0.9115(G)     0.9433(G)     0.9207(G)     0.9928(G) speedups     0.9330(G)     0.7614(d)     0.9076(G)     0.8149(d) 
783 [   xor    max    int16] : auto speedups     0.9028(G)     0.9049(G)     0.8911(G)     0.9645(G) speedups     0.9060(G)     0.8159(d)     0.9032(G)     0.8546(d) 
784 [   xor    max   uint16] : auto speedups     0.8419(G)     0.8621(G)     0.8537(G)     0.9001(G) speedups     0.9143(G)     0.8039(d)     0.9166(G)     0.8522(d) 
785 [   xor    max    int32] : auto speedups     0.7828(G)     0.7930(G)     0.8008(G)     0.8373(G) speedups     0.7825(G)     0.7937(d)     0.7672(G)     0.8661(d) 
786 [   xor    max   uint32] : auto speedups     0.7099(G)     0.7217(G)     0.7208(G)     0.7754(G) speedups     0.9308(G)     0.8039(d)     0.9041(G)     0.8477(d) 
787 [   xor    max    int64] : auto speedups     0.8615(G)     0.9034(G)     0.8796(G)     0.9322(G) speedups     0.8636(G)     0.8233(d)     0.8442(G)     0.8447(d) 
788 [   xor    max   uint64] : auto speedups     0.8381(G)     0.8543(G)     0.8471(G)     0.8978(G) speedups     0.8857(G)     0.7599(d)     0.8670(G)     0.7981(d) 
789 [   xor    max   single] : auto speedups     0.7300(G)     0.7344(G)     0.7334(G)     0.7893(G) speedups     0.8568(G)     0.8993(d)     0.8478(G)     0.9750(d) 
790 [   xor    max   double] : auto speedups     0.6972(G)     0.7100(G)     0.6971(G)     0.7446(G) speedups     0.8560(G)     1.9233(d)     0.8162(G)     2.0325(d) 
791 [   xor   plus  logical] : auto speedups     1.1146(G)     1.1550(G)     1.1185(G)     1.2192(G) speedups     0.8495(G)     0.9653(d)     0.8407(G)     1.0187(d) 
792 [   xor   plus     int8] : auto speedups     1.0696(G)     1.1069(G)     1.0751(G)     1.1453(G) speedups     0.8409(G)     0.8190(d)     0.8210(G)     0.8657(d) 
793 [   xor   plus    uint8] : auto speedups     1.0825(G)     1.0782(G)     1.0737(G)     1.1598(G) speedups     0.8321(G)     0.8257(d)     0.8140(G)     0.8793(d) 
794 [   xor   plus    int16] : auto speedups     1.0316(G)     1.0487(G)     1.0510(G)     1.0929(G) speedups     0.9352(G)     0.8258(d)     0.9036(G)     0.8476(d) 
795 [   xor   plus   uint16] : auto speedups     1.0483(G)     1.0607(G)     0.9848(G)     1.1009(G) speedups     0.9145(G)     0.7705(d)     0.7496(G)     0.7542(d) 
796 [   xor   plus    int32] : auto speedups     0.9366(G)     0.9036(G)     0.9246(G)     0.9570(G) speedups     0.7756(G)     0.7688(d)     0.8051(G)     0.8879(d) 
797 [   xor   plus   uint32] : auto speedups     0.9365(G)     1.0631(G)     1.0380(G)     0.9469(G) speedups     0.6806(G)     0.7471(d)     0.7744(G)     0.8415(d) 
798 [   xor   plus    int64] : auto speedups     0.9011(G)     0.9628(G)     0.9741(G)     1.0334(G) speedups     0.9005(G)     0.8245(d)     0.8995(G)     0.8755(d) 
799 [   xor   plus   uint64] : auto speedups     0.9458(G)     0.9616(G)     0.9681(G)     1.0532(G) speedups     0.7971(G)     0.7640(d)     0.7799(G)     0.7981(d) 
800 [   xor   plus   single] : auto speedups     0.8648(G)     0.8720(G)     0.8036(G)     0.8766(G) speedups     0.8682(G)     0.9549(d)     0.8854(G)     0.9693(d) 
801 [   xor   plus   double] : auto speedups     0.8382(G)     0.8279(G)     0.8627(G)     0.8883(G) speedups     0.9243(G)     1.9875(d)     0.8589(G)     2.0132(d) 
802 [   xor  times  logical] : auto speedups     1.0529(G)     1.0675(G)     1.0588(G)     1.1779(G) speedups     0.9492(G)     0.9937(d)     0.9216(G)     1.0600(d) 
803 [   xor  times     int8] : auto speedups     1.0979(G)     1.0955(G)     1.1400(G)     1.2302(G) speedups     0.9647(G)     0.8569(d)     0.9503(G)     0.9062(d) 
804 [   xor  times    uint8] : auto speedups     1.1299(G)     1.1417(G)     1.1418(G)     1.2118(G) speedups     0.9087(G)     0.8521(d)     0.9471(G)     0.9046(d) 
805 [   xor  times    int16] : auto speedups     0.9407(G)     0.9381(G)     0.9637(G)     1.0149(G) speedups     0.9192(G)     0.8555(d)     0.8952(G)     0.8974(d) 
806 [   xor  times   uint16] : auto speedups     0.9377(G)     0.9462(G)     0.9451(G)     0.9995(G) speedups     0.9974(G)     0.8604(d)     0.9679(G)     0.9065(d) 
807 [   xor  times    int32] : auto speedups     0.9336(G)     0.9605(G)     0.9365(G)     1.0216(G) speedups     0.9177(G)     0.8405(d)     0.9074(G)     0.8730(d) 
808 [   xor  times   uint32] : auto speedups     0.9293(G)     0.9650(G)     0.9403(G)     1.0137(G) speedups     0.8633(G)     0.8651(d)     0.8426(G)     0.9001(d) 
809 [   xor  times    int64] : auto speedups     0.9491(G)     0.9696(G)     0.9404(G)     1.0143(G) speedups     0.8488(G)     0.7920(d)     0.7557(G)     0.8950(d) 
810 [   xor  times   uint64] : auto speedups     0.9102(G)     0.9192(G)     0.9412(G)     0.9822(G) speedups     0.8800(G)     0.7564(d)     0.8699(G)     0.6375(d) 
811 [   xor  times   single] : auto speedups     0.8450(G)     0.8959(G)     0.8822(G)     0.9432(G) speedups     0.8239(G)     0.9301(d)     0.9112(G)     1.0026(d) 
812 [   xor  times   double] : auto speedups     0.7957(G)     0.8251(G)     0.8132(G)     0.8797(G) speedups     0.8917(G)     1.8749(d)     0.8821(G)     2.0326(d) 
813 [   xor     or  logical] : auto speedups     1.1716(G)     1.1901(G)     1.1914(G)     1.2502(G) speedups     0.8729(G)     0.9878(d)     0.8420(G)     1.0392(d) 
814 [   xor    and  logical] : auto speedups     1.0181(G)     1.0494(G)     1.0598(G)     1.1476(G) speedups     0.9530(G)     1.0117(d)     0.9414(G)     1.0216(d) 
815 [   xor    xor  logical] : auto speedups     1.0524(G)     1.1031(G)     1.0981(G)     1.1819(G) speedups     0.8262(G)     1.0000(d)     0.8068(G)     1.0237(d) 
816 [   xor     eq  logical] : auto speedups     1.0677(G)     1.0064(G)     0.9518(G)     1.1596(G) speedups     0.9702(G)     0.9816(d)     0.9622(G)     1.0564(d) 
817 [    eq    min  logical] : auto speedups     1.0758(G)     1.1120(G)     1.0917(G)     1.1408(G) speedups     0.9625(G)     0.9773(d)     0.9204(G)     1.0416(d) 
818 [    eq    min     int8] : auto speedups     1.0376(G)     1.0702(G)     1.0614(G)     1.1380(G) speedups     0.8962(G)     0.7679(d)     0.8523(G)     0.8377(d) 
819 [    eq    min    uint8] : auto speedups     1.0325(G)     1.0295(G)     1.0561(G)     1.1135(G) speedups     0.8321(G)     0.8192(d)     0.8268(G)     0.8580(d) 
820 [    eq    min    int16] : auto speedups     1.0550(G)     1.0751(G)     1.0836(G)     1.1383(G) speedups     0.9355(G)     0.8417(d)     0.9165(G)     0.8806(d) 
821 [    eq    min   uint16] : auto speedups     1.0422(G)     1.0646(G)     1.0676(G)     1.1541(G) speedups     0.8481(G)     0.8393(d)     0.8368(G)     0.8606(d) 
822 [    eq    min    int32] : auto speedups     1.0552(G)     1.0860(G)     1.0744(G)     1.1527(G) speedups     0.9070(G)     0.8310(d)     0.8777(G)     0.8582(d) 
823 [    eq    min   uint32] : auto speedups     1.0485(G)     1.0901(G)     1.0731(G)     1.1226(G) speedups     0.9425(G)     0.8454(d)     0.9090(G)     0.8713(d) 
824 [    eq    min    int64] : auto speedups     1.0333(G)     1.0361(G)     1.0231(G)     1.1027(G) speedups     0.9308(G)     0.8412(d)     0.8917(G)     0.8806(d) 
825 [    eq    min   uint64] : auto speedups     1.0275(G)     1.0274(G)     1.0332(G)     1.1048(G) speedups     0.8730(G)     0.7980(d)     0.8706(G)     0.8250(d) 
826 [    eq    min   single] : auto speedups     1.0055(G)     0.9452(G)     0.8816(G)     1.0127(G) speedups     0.9014(G)     0.9918(d)     0.9135(G)     1.0503(d) 
827 [    eq    min   double] : auto speedups     0.9581(G)     0.9754(G)     0.9609(G)     1.0293(G) speedups     0.9185(G)     2.3980(d)     0.9025(G)     2.4542(d) 
828 [    eq    max  logical] : auto speedups     1.1553(G)     1.1826(G)     1.1634(G)     1.2706(G) speedups     0.9179(G)     1.0079(d)     0.8963(G)     1.0634(d) 
829 [    eq    max     int8] : auto speedups     1.1171(G)     1.1648(G)     1.1498(G)     1.2172(G) speedups     0.8363(G)     0.8443(d)     0.8236(G)     0.8916(d) 
830 [    eq    max    uint8] : auto speedups     1.1463(G)     1.1453(G)     1.1493(G)     1.2340(G) speedups     0.9163(G)     0.8542(d)     0.8853(G)     0.9058(d) 
831 [    eq    max    int16] : auto speedups     1.1439(G)     1.1593(G)     1.1434(G)     1.2379(G) speedups     0.8167(G)     0.8487(d)     0.8212(G)     0.9021(d) 
832 [    eq    max   uint16] : auto speedups     1.1341(G)     1.1689(G)     1.1440(G)     1.2170(G) speedups     0.9520(G)     0.8600(d)     0.9151(G)     0.9083(d) 
833 [    eq    max    int32] : auto speedups     1.1305(G)     1.1384(G)     1.1469(G)     1.2120(G) speedups     0.9138(G)     0.8599(d)     0.8684(G)     0.9007(d) 
834 [    eq    max   uint32] : auto speedups     1.1683(G)     1.1695(G)     1.1622(G)     1.2379(G) speedups     0.9403(G)     0.8468(d)     0.9125(G)     0.9082(d) 
835 [    eq    max    int64] : auto speedups     1.0885(G)     1.1105(G)     1.0913(G)     1.1864(G) speedups     0.8693(G)     0.8635(d)     0.8687(G)     0.9166(d) 
836 [    eq    max   uint64] : auto speedups     1.1135(G)     1.1373(G)     1.1238(G)     1.1869(G) speedups     0.9219(G)     0.8244(d)     0.9034(G)     0.8653(d) 
837 [    eq    max   single] : auto speedups     0.9902(G)     1.0168(G)     0.9949(G)     1.0541(G) speedups     0.9524(G)     0.9899(d)     0.9318(G)     1.0187(d) 
838 [    eq    max   double] : auto speedups     0.9652(G)     0.9568(G)     0.9679(G)     1.0335(G) speedups     0.8738(G)     2.3601(d)     0.9072(G)     2.5082(d) 
839 [    eq   plus  logical] : auto speedups     1.1732(G)     1.1832(G)     1.1627(G)     1.2710(G) speedups     0.9157(G)     1.0085(d)     0.8963(G)     1.0595(d) 
840 [    eq   plus     int8] : auto speedups     1.1467(G)     1.1640(G)     1.1356(G)     1.2313(G) speedups     0.8243(G)     0.8491(d)     0.8204(G)     0.9013(d) 
841 [    eq   plus    uint8] : auto speedups     1.1284(G)     1.1653(G)     1.1392(G)     1.2413(G) speedups     0.9133(G)     0.8508(d)     0.8857(G)     0.8904(d) 
842 [    eq   plus    int16] : auto speedups     1.1238(G)     1.1612(G)     1.1489(G)     1.0569(G) speedups     0.7950(G)     0.7994(d)     0.7802(G)     0.8469(d) 
843 [    eq   plus   uint16] : auto speedups     1.1188(G)     1.1420(G)     1.0772(G)     1.2194(G) speedups     0.9535(G)     0.8478(d)     0.9090(G)     0.8908(d) 
844 [    eq   plus    int32] : auto speedups     1.1584(G)     1.1881(G)     1.1679(G)     1.2640(G) speedups     0.9081(G)     0.8620(d)     0.8878(G)     0.9010(d) 
845 [    eq   plus   uint32] : auto speedups     1.0925(G)     1.1829(G)     1.1618(G)     1.2458(G) speedups     0.9264(G)     0.8579(d)     0.9065(G)     0.9080(d) 
846 [    eq   plus    int64] : auto speedups     1.0785(G)     1.1228(G)     1.0925(G)     1.1769(G) speedups     0.8914(G)     0.8683(d)     0.8617(G)     0.9211(d) 
847 [    eq   plus   uint64] : auto speedups     1.0957(G)     1.1115(G)     1.1196(G)     1.1820(G) speedups     0.9131(G)     0.8099(d)     0.8953(G)     0.8536(d) 
848 [    eq   plus   single] : auto speedups     1.0044(G)     1.0187(G)     0.9904(G)     1.0780(G) speedups     0.9527(G)     0.9687(d)     0.9449(G)     1.0203(d) 
849 [    eq   plus   double] : auto speedups     0.9527(G)     0.9691(G)     0.9436(G)     1.0180(G) speedups     0.9392(G)     2.4690(d)     0.8946(G)     2.4149(d) 
850 [    eq  times  logical] : auto speedups     1.0786(G)     1.0946(G)     1.0936(G)     1.1565(G) speedups     0.9496(G)     0.9850(d)     0.9343(G)     1.0485(d) 
851 [    eq  times     int8] : auto speedups     1.0638(G)     1.0836(G)     1.0823(G)     1.1437(G) speedups     0.9081(G)     0.8307(d)     0.8767(G)     0.8661(d) 
852 [    eq  times    uint8] : auto speedups     1.0729(G)     1.0877(G)     1.0786(G)     1.1559(G) speedups     0.8488(G)     0.8183(d)     0.8324(G)     0.8753(d) 
853 [    eq  times    int16] : auto speedups     1.0479(G)     1.0710(G)     1.0781(G)     1.1567(G) speedups     0.9290(G)     0.8219(d)     0.9109(G)     0.8761(d) 
854 [    eq  times   uint16] : auto speedups     1.0686(G)     1.0825(G)     1.0792(G)     1.1594(G) speedups     0.8402(G)     0.8403(d)     0.8220(G)     0.8628(d) 
855 [    eq  times    int32] : auto speedups     1.0494(G)     1.0664(G)     1.0600(G)     1.1080(G) speedups     0.8911(G)     0.8469(d)     0.8787(G)     0.7773(d) 
856 [    eq  times   uint32] : auto speedups     1.0559(G)     1.0570(G)     1.0743(G)     1.1372(G) speedups     0.9399(G)     0.8467(d)     0.9178(G)     0.8536(d) 
857 [    eq  times    int64] : auto speedups     1.0076(G)     1.0388(G)     1.0243(G)     1.0810(G) speedups     0.9373(G)     0.8374(d)     0.9034(G)     0.8842(d) 
858 [    eq  times   uint64] : auto speedups     1.0321(G)     1.0374(G)     1.0139(G)     1.1087(G) speedups     0.8986(G)     0.7933(d)     0.8675(G)     0.8293(d) 
859 [    eq  times   single] : auto speedups     0.9879(G)     1.0252(G)     1.0099(G)     1.0722(G) speedups     0.9536(G)     1.0018(d)     0.9352(G)     1.0654(d) 
860 [    eq  times   double] : auto speedups     0.9754(G)     0.9722(G)     0.9762(G)     1.0175(G) speedups     0.9387(G)     2.4292(d)     0.8987(G)     2.4912(d) 
861 [    eq     or  logical] : auto speedups     1.1816(G)     1.1902(G)     1.1808(G)     1.2736(G) speedups     0.9159(G)     1.0135(d)     0.8970(G)     1.0576(d) 
862 [    eq     or     int8] : auto speedups     1.1464(G)     1.1765(G)     1.1400(G)     1.2333(G) speedups     0.8430(G)     0.8273(d)     0.8140(G)     0.8902(d) 
863 [    eq     or    uint8] : auto speedups     1.1317(G)     1.1496(G)     1.1376(G)     1.2435(G) speedups     0.9085(G)     0.8631(d)     0.8905(G)     0.8891(d) 
864 [    eq     or    int16] : auto speedups     1.1357(G)     1.1681(G)     1.1324(G)     1.2326(G) speedups     0.8305(G)     0.8485(d)     0.8194(G)     0.9133(d) 
865 [    eq     or   uint16] : auto speedups     1.1471(G)     1.1618(G)     1.1149(G)     1.2257(G) speedups     0.9405(G)     0.8401(d)     0.9287(G)     0.8930(d) 
866 [    eq     or    int32] : auto speedups     1.1218(G)     1.1634(G)     1.1456(G)     1.2386(G) speedups     0.9077(G)     0.8510(d)     0.8727(G)     0.9099(d) 
867 [    eq     or   uint32] : auto speedups     1.1380(G)     1.1624(G)     1.1370(G)     1.2037(G) speedups     0.9421(G)     0.8473(d)     0.9193(G)     0.9002(d) 
868 [    eq     or    int64] : auto speedups     1.1003(G)     1.1129(G)     1.1367(G)     1.1618(G) speedups     0.8864(G)     0.8799(d)     0.8613(G)     0.9125(d) 
869 [    eq     or   uint64] : auto speedups     1.1201(G)     1.1050(G)     1.1148(G)     1.1640(G) speedups     0.8866(G)     0.8317(d)     0.8985(G)     0.8565(d) 
870 [    eq     or   single] : auto speedups     0.9775(G)     1.0078(G)     1.0148(G)     1.0712(G) speedups     0.9619(G)     0.9759(d)     0.9466(G)     1.0325(d) 
871 [    eq     or   double] : auto speedups     0.9635(G)     0.9879(G)     0.9643(G)     1.0338(G) speedups     0.9402(G)     2.4345(d)     0.9263(G)     2.4989(d) 
872 [    eq    and  logical] : auto speedups     1.0777(G)     1.0946(G)     1.0813(G)     1.1574(G) speedups     0.9625(G)     0.9938(d)     0.9361(G)     1.0559(d) 
873 [    eq    and     int8] : auto speedups     1.0689(G)     1.0893(G)     1.0808(G)     1.1576(G) speedups     0.9170(G)     0.8262(d)     0.8857(G)     0.8749(d) 
874 [    eq    and    uint8] : auto speedups     1.0550(G)     1.0895(G)     1.0702(G)     1.1604(G) speedups     0.8650(G)     0.8117(d)     0.8214(G)     0.8703(d) 
875 [    eq    and    int16] : auto speedups     1.0646(G)     1.0804(G)     1.0714(G)     1.1525(G) speedups     0.9453(G)     0.8257(d)     0.9108(G)     0.8863(d) 
876 [    eq    and   uint16] : auto speedups     1.0460(G)     1.0984(G)     1.0731(G)     1.1561(G) speedups     0.8482(G)     0.8170(d)     0.8231(G)     0.8623(d) 
877 [    eq    and    int32] : auto speedups     1.0693(G)     1.0861(G)     1.0805(G)     1.1531(G) speedups     0.9001(G)     0.8296(d)     0.8829(G)     0.8939(d) 
878 [    eq    and   uint32] : auto speedups     1.0648(G)     1.0795(G)     1.0723(G)     1.1400(G) speedups     0.9464(G)     0.8330(d)     0.9219(G)     0.8878(d) 
879 [    eq    and    int64] : auto speedups     1.0213(G)     1.0512(G)     1.0427(G)     1.1300(G) speedups     0.9199(G)     0.8386(d)     0.9031(G)     0.8929(d) 
880 [    eq    and   uint64] : auto speedups     1.0444(G)     1.0490(G)     1.0418(G)     1.0918(G) speedups     0.8888(G)     0.8107(d)     0.8581(G)     0.8443(d) 
881 [    eq    and   single] : auto speedups     1.0028(G)     1.0295(G)     1.0097(G)     1.0784(G) speedups     0.9638(G)     1.0103(d)     0.9483(G)     1.0665(d) 
882 [    eq    and   double] : auto speedups     0.9771(G)     0.9955(G)     0.9807(G)     1.0202(G) speedups     0.9375(G)     2.4217(d)     0.9012(G)     2.4807(d) 
883 [    eq    xor  logical] : auto speedups     1.0837(G)     1.1082(G)     1.1009(G)     1.1324(G) speedups     0.9446(G)     1.0071(d)     0.9083(G)     1.0377(d) 
884 [    eq    xor     int8] : auto speedups     1.0720(G)     1.1045(G)     1.0815(G)     1.1511(G) speedups     0.8795(G)     0.8727(d)     0.8629(G)     0.9043(d) 
885 [    eq    xor    uint8] : auto speedups     1.0467(G)     1.0873(G)     1.0944(G)     1.1700(G) speedups     0.9103(G)     0.8743(d)     0.8940(G)     0.9146(d) 
886 [    eq    xor    int16] : auto speedups     1.0595(G)     1.0690(G)     1.0831(G)     1.1211(G) speedups     0.9863(G)     0.8637(d)     0.9428(G)     0.9084(d) 
887 [    eq    xor   uint16] : auto speedups     1.0701(G)     1.0752(G)     1.0834(G)     1.1416(G) speedups     0.8746(G)     0.8786(d)     0.8504(G)     0.8984(d) 
888 [    eq    xor    int32] : auto speedups     1.0650(G)     1.0816(G)     1.0785(G)     1.1575(G) speedups     0.9638(G)     0.8692(d)     0.9426(G)     0.9236(d) 
889 [    eq    xor   uint32] : auto speedups     1.0573(G)     1.0709(G)     1.0393(G)     1.1347(G) speedups     0.8990(G)     0.8670(d)     0.8776(G)     0.9166(d) 
890 [    eq    xor    int64] : auto speedups     1.0283(G)     1.0304(G)     1.0267(G)     1.0931(G) speedups     0.8884(G)     0.8586(d)     0.8260(G)     0.8843(d) 
891 [    eq    xor   uint64] : auto speedups     0.9488(G)     0.9832(G)     0.9495(G)     0.9521(G) speedups     0.9270(G)     0.6932(d)     0.8674(G)     0.8230(d) 
892 [    eq    xor   single] : auto speedups     1.0377(G)     1.0752(G)     1.0153(G)     1.1464(G) speedups     0.9229(G)     0.9936(d)     0.9074(G)     1.0322(d) 
893 [    eq    xor   double] : auto speedups     1.0513(G)     1.0393(G)     1.0346(G)     1.1217(G) speedups     0.9565(G)     2.4667(d)     0.9315(G)     2.4938(d) 
894 [    eq     eq  logical] : auto speedups     1.0953(G)     1.0890(G)     1.0872(G)     1.1680(G) speedups     0.9448(G)     0.9873(d)     0.9188(G)     1.0372(d) 
895 [    eq     eq     int8] : auto speedups     1.0581(G)     1.1148(G)     1.0751(G)     1.1670(G) speedups     0.9294(G)     0.8526(d)     0.9112(G)     0.9094(d) 
896 [    eq     eq    uint8] : auto speedups     1.0659(G)     1.0966(G)     1.1021(G)     1.1646(G) speedups     0.8908(G)     0.8527(d)     0.8851(G)     0.9028(d) 
897 [    eq     eq    int16] : auto speedups     1.0850(G)     1.1023(G)     1.0608(G)     1.1559(G) speedups     0.9293(G)     0.8638(d)     0.9057(G)     0.9037(d) 
898 [    eq     eq   uint16] : auto speedups     1.0877(G)     1.0988(G)     1.0860(G)     1.1567(G) speedups     0.9590(G)     0.8576(d)     0.8843(G)     0.8850(d) 
899 [    eq     eq    int32] : auto speedups     1.0807(G)     1.0993(G)     1.0627(G)     1.1703(G) speedups     0.9742(G)     0.8655(d)     0.9644(G)     0.9209(d) 
900 [    eq     eq   uint32] : auto speedups     1.0833(G)     1.1096(G)     1.0735(G)     1.1429(G) speedups     0.9101(G)     0.8728(d)     0.8879(G)     0.9322(d) 
901 [    eq     eq    int64] : auto speedups     1.0293(G)     1.0192(G)     1.0507(G)     1.1048(G) speedups     0.9605(G)     0.8640(d)     0.9263(G)     0.9152(d) 
902 [    eq     eq   uint64] : auto speedups     1.0528(G)     1.0490(G)     1.0533(G)     1.1168(G) speedups     0.8810(G)     0.8223(d)     0.8643(G)     0.8729(d) 
903 [    eq     eq   single] : auto speedups     1.0391(G)     1.0577(G)     1.0428(G)     1.1312(G) speedups     0.9075(G)     0.9777(d)     0.8981(G)     1.0266(d) 
904 [    eq     eq   double] : auto speedups     1.0245(G)     0.9622(G)     0.9943(G)     1.0561(G) speedups     0.9018(G)     2.5132(d)     0.8785(G)     2.5797(d) 
905 [    ne    min  logical] : auto speedups     1.0805(G)     1.0921(G)     1.0806(G)     1.1582(G) speedups     0.9627(G)     1.0105(d)     0.9232(G)     1.0802(d) 
906 [    ne    min     int8] : auto speedups     1.0689(G)     1.0920(G)     1.0745(G)     1.1570(G) speedups     0.8364(G)     0.8538(d)     0.8176(G)     0.8951(d) 
907 [    ne    min    uint8] : auto speedups     1.0585(G)     1.0921(G)     1.0830(G)     1.1565(G) speedups     0.8944(G)     0.8415(d)     0.8842(G)     0.8705(d) 
908 [    ne    min    int16] : auto speedups     1.0631(G)     1.0958(G)     1.0614(G)     1.1660(G) speedups     0.8256(G)     0.8591(d)     0.8234(G)     0.9219(d) 
909 [    ne    min   uint16] : auto speedups     1.0688(G)     1.0859(G)     1.0939(G)     1.1251(G) speedups     0.9432(G)     0.8677(d)     0.9147(G)     0.8927(d) 
910 [    ne    min    int32] : auto speedups     1.0731(G)     1.0692(G)     1.0728(G)     1.1319(G) speedups     0.9349(G)     0.8657(d)     0.9248(G)     0.9160(d) 
911 [    ne    min   uint32] : auto speedups     1.0694(G)     1.0884(G)     1.0500(G)     1.1564(G) speedups     0.8934(G)     0.8549(d)     0.8815(G)     0.9187(d) 
912 [    ne    min    int64] : auto speedups     1.0040(G)     1.0346(G)     1.0455(G)     1.0994(G) speedups     0.9012(G)     0.8617(d)     0.8639(G)     0.9037(d) 
913 [    ne    min   uint64] : auto speedups     1.0502(G)     1.0378(G)     1.0246(G)     1.1077(G) speedups     0.9150(G)     0.8121(d)     0.9080(G)     0.8536(d) 
914 [    ne    min   single] : auto speedups     0.9906(G)     1.0158(G)     0.9842(G)     1.0771(G) speedups     0.8399(G)     0.9924(d)     0.8062(G)     1.0271(d) 
915 [    ne    min   double] : auto speedups     0.9717(G)     0.9503(G)     0.9722(G)     1.0376(G) speedups     0.8162(G)     2.3827(d)     0.7919(G)     2.4974(d) 
916 [    ne    max  logical] : auto speedups     1.1812(G)     1.1977(G)     1.1903(G)     1.2653(G) speedups     0.8859(G)     0.9881(d)     0.8632(G)     1.0613(d) 
917 [    ne    max     int8] : auto speedups     1.1035(G)     1.1599(G)     1.0926(G)     1.2314(G) speedups     0.8606(G)     0.8196(d)     0.8312(G)     0.8699(d) 
918 [    ne    max    uint8] : auto speedups     1.1509(G)     1.1535(G)     1.1492(G)     1.2150(G) speedups     0.8355(G)     0.8405(d)     0.8167(G)     0.8636(d) 
919 [    ne    max    int16] : auto speedups     1.1369(G)     1.1455(G)     1.1331(G)     1.2348(G) speedups     0.9254(G)     0.8453(d)     0.9144(G)     0.8882(d) 
920 [    ne    max   uint16] : auto speedups     1.1257(G)     1.1619(G)     1.1356(G)     1.2108(G) speedups     0.9292(G)     0.8500(d)     0.9459(G)     0.8851(d) 
921 [    ne    max    int32] : auto speedups     1.1389(G)     1.1765(G)     1.1712(G)     1.2374(G) speedups     0.9037(G)     0.8432(d)     0.8840(G)     0.8797(d) 
922 [    ne    max   uint32] : auto speedups     1.1403(G)     1.1734(G)     1.1681(G)     1.2442(G) speedups     0.9186(G)     0.8486(d)     0.9138(G)     0.8912(d) 
923 [    ne    max    int64] : auto speedups     1.1111(G)     1.1237(G)     1.1307(G)     1.1859(G) speedups     0.9047(G)     0.8488(d)     0.8657(G)     0.8929(d) 
924 [    ne    max   uint64] : auto speedups     1.0954(G)     1.1285(G)     1.1112(G)     1.1904(G) speedups     0.8876(G)     0.7941(d)     0.8671(G)     0.8192(d) 
925 [    ne    max   single] : auto speedups     0.9180(G)     0.9605(G)     0.9606(G)     0.9997(G) speedups     0.7996(G)     0.9322(d)     0.7713(G)     0.9654(d) 
926 [    ne    max   double] : auto speedups     0.8799(G)     0.8814(G)     0.9130(G)     0.9201(G) speedups     0.7737(G)     2.1977(d)     0.7626(G)     2.3141(d) 
927 [    ne   plus  logical] : auto speedups     1.1019(G)     1.1519(G)     1.1166(G)     1.1972(G) speedups     0.8595(G)     0.9487(d)     0.8265(G)     0.9732(d) 
928 [    ne   plus     int8] : auto speedups     1.0680(G)     1.1076(G)     1.0575(G)     1.1782(G) speedups     0.8160(G)     0.7989(d)     0.8094(G)     0.8655(d) 
929 [    ne   plus    uint8] : auto speedups     1.0738(G)     1.0983(G)     1.1000(G)     1.1633(G) speedups     0.7760(G)     0.7942(d)     0.7839(G)     0.8362(d) 
930 [    ne   plus    int16] : auto speedups     1.0798(G)     1.1224(G)     1.0872(G)     1.1913(G) speedups     0.9036(G)     0.8174(d)     0.9126(G)     0.8832(d) 
931 [    ne   plus   uint16] : auto speedups     1.0982(G)     1.1612(G)     1.1108(G)     1.2199(G) speedups     0.9446(G)     0.8292(d)     0.9227(G)     0.8886(d) 
932 [    ne   plus    int32] : auto speedups     1.1361(G)     1.1794(G)     1.1553(G)     1.2508(G) speedups     0.9057(G)     0.8365(d)     0.8860(G)     0.9027(d) 
933 [    ne   plus   uint32] : auto speedups     1.1325(G)     1.1594(G)     1.1476(G)     1.2188(G) speedups     0.9289(G)     0.8524(d)     0.9102(G)     0.8876(d) 
934 [    ne   plus    int64] : auto speedups     1.1050(G)     1.0795(G)     1.1172(G)     1.1771(G) speedups     0.8840(G)     0.8469(d)     0.8605(G)     0.8940(d) 
935 [    ne   plus   uint64] : auto speedups     1.1248(G)     1.1019(G)     1.1161(G)     1.1808(G) speedups     0.8851(G)     0.7970(d)     0.8563(G)     0.8368(d) 
936 [    ne   plus   single] : auto speedups     0.9921(G)     1.0174(G)     0.9582(G)     1.0769(G) speedups     0.8323(G)     0.9960(d)     0.8175(G)     1.0521(d) 
937 [    ne   plus   double] : auto speedups     0.9694(G)     0.9700(G)     0.9786(G)     1.0174(G) speedups     0.7692(G)     2.4432(d)     0.8111(G)     2.4822(d) 
938 [    ne  times  logical] : auto speedups     1.0713(G)     1.0772(G)     1.0858(G)     1.1739(G) speedups     0.9417(G)     1.0046(d)     0.9474(G)     1.0668(d) 
939 [    ne  times     int8] : auto speedups     1.0671(G)     1.0840(G)     1.0496(G)     1.1599(G) speedups     0.8404(G)     0.8513(d)     0.7814(G)     0.8904(d) 
940 [    ne  times    uint8] : auto speedups     1.0660(G)     1.0766(G)     1.0866(G)     1.1504(G) speedups     0.9003(G)     0.8622(d)     0.8882(G)     0.9029(d) 
941 [    ne  times    int16] : auto speedups     1.0831(G)     1.0719(G)     1.0709(G)     1.1635(G) speedups     0.8364(G)     0.8785(d)     0.8274(G)     0.9137(d) 
942 [    ne  times   uint16] : auto speedups     1.0628(G)     1.0905(G)     1.0660(G)     1.1291(G) speedups     0.9414(G)     0.8691(d)     0.9220(G)     0.8932(d) 
943 [    ne  times    int32] : auto speedups     1.0751(G)     1.0634(G)     1.0636(G)     1.1277(G) speedups     0.9423(G)     0.8743(d)     0.9204(G)     0.9055(d) 
944 [    ne  times   uint32] : auto speedups     1.0629(G)     1.0880(G)     1.0618(G)     1.1455(G) speedups     0.8950(G)     0.8697(d)     0.8841(G)     0.9003(d) 
945 [    ne  times    int64] : auto speedups     1.0301(G)     1.0521(G)     1.0265(G)     1.0875(G) speedups     0.9011(G)     0.8662(d)     0.8749(G)     0.9045(d) 
946 [    ne  times   uint64] : auto speedups     1.0308(G)     1.0378(G)     1.0397(G)     1.0871(G) speedups     0.9307(G)     0.8215(d)     0.9030(G)     0.8624(d) 
947 [    ne  times   single] : auto speedups     0.9767(G)     1.0132(G)     0.9976(G)     1.0732(G) speedups     0.8333(G)     0.9738(d)     0.8188(G)     1.0538(d) 
948 [    ne  times   double] : auto speedups     0.9644(G)     0.9703(G)     0.9726(G)     1.0268(G) speedups     0.8298(G)     2.3743(d)     0.8073(G)     2.4982(d) 
949 [    ne     or  logical] : auto speedups     1.1684(G)     1.1784(G)     1.1828(G)     1.2589(G) speedups     0.8871(G)     0.9938(d)     0.8733(G)     1.0524(d) 
950 [    ne     or     int8] : auto speedups     1.1117(G)     1.1523(G)     1.1232(G)     1.2275(G) speedups     0.8528(G)     0.8200(d)     0.8322(G)     0.8803(d) 
951 [    ne     or    uint8] : auto speedups     1.1293(G)     1.1481(G)     1.1525(G)     1.2403(G) speedups     0.8326(G)     0.8314(d)     0.8206(G)     0.8796(d) 
952 [    ne     or    int16] : auto speedups     1.1110(G)     1.1693(G)     1.1532(G)     1.2211(G) speedups     0.9298(G)     0.8378(d)     0.9115(G)     0.8825(d) 
953 [    ne     or   uint16] : auto speedups     1.1250(G)     1.1600(G)     1.1432(G)     1.2300(G) speedups     0.9538(G)     0.8457(d)     0.9332(G)     0.8772(d) 
954 [    ne     or    int32] : auto speedups     1.1692(G)     1.1419(G)     1.1419(G)     1.2323(G) speedups     0.8984(G)     0.8391(d)     0.8738(G)     0.8841(d) 
955 [    ne     or   uint32] : auto speedups     1.1340(G)     1.1922(G)     1.1391(G)     1.2020(G) speedups     0.9213(G)     0.8468(d)     0.9124(G)     0.8908(d) 
956 [    ne     or    int64] : auto speedups     1.1137(G)     1.1336(G)     1.1134(G)     1.1969(G) speedups     0.8887(G)     0.8484(d)     0.8755(G)     0.8965(d) 
957 [    ne     or   uint64] : auto speedups     1.0865(G)     1.1126(G)     1.1333(G)     1.1470(G) speedups     0.8917(G)     0.7984(d)     0.8605(G)     0.8381(d) 
958 [    ne     or   single] : auto speedups     0.9636(G)     1.0157(G)     0.9894(G)     1.0851(G) speedups     0.8275(G)     0.9903(d)     0.8241(G)     1.0528(d) 
959 [    ne     or   double] : auto speedups     0.9628(G)     0.9836(G)     0.9686(G)     1.0408(G) speedups     0.8246(G)     2.4409(d)     0.7895(G)     2.5085(d) 
960 [    ne    and  logical] : auto speedups     1.0869(G)     1.0857(G)     1.0829(G)     1.1694(G) speedups     0.9736(G)     1.0215(d)     0.9552(G)     1.0582(d) 
961 [    ne    and     int8] : auto speedups     1.0748(G)     1.0894(G)     1.0783(G)     1.1544(G) speedups     0.8329(G)     0.8496(d)     0.8209(G)     0.8999(d) 
962 [    ne    and    uint8] : auto speedups     1.0491(G)     1.0923(G)     1.0632(G)     1.1532(G) speedups     0.9090(G)     0.8455(d)     0.8960(G)     0.8986(d) 
963 [    ne    and    int16] : auto speedups     1.0775(G)     1.0669(G)     1.0840(G)     1.1622(G) speedups     0.8288(G)     0.8746(d)     0.8210(G)     0.9142(d) 
964 [    ne    and   uint16] : auto speedups     1.0771(G)     1.0721(G)     1.0841(G)     1.1476(G) speedups     0.9328(G)     0.8659(d)     0.9178(G)     0.9013(d) 
965 [    ne    and    int32] : auto speedups     1.0614(G)     1.0901(G)     1.0746(G)     1.1264(G) speedups     0.9380(G)     0.7589(d)     0.9002(G)     0.8731(d) 
966 [    ne    and   uint32] : auto speedups     1.0302(G)     1.0935(G)     1.0665(G)     1.1384(G) speedups     0.9081(G)     0.8769(d)     0.8810(G)     0.8885(d) 
967 [    ne    and    int64] : auto speedups     1.0261(G)     1.0497(G)     1.0154(G)     1.0977(G) speedups     0.8851(G)     0.8717(d)     0.8762(G)     0.8950(d) 
968 [    ne    and   uint64] : auto speedups     1.0300(G)     1.0596(G)     0.9873(G)     1.0650(G) speedups     0.8344(G)     0.7861(d)     0.8819(G)     0.8373(d) 
969 [    ne    and   single] : auto speedups     0.9737(G)     1.0144(G)     0.9888(G)     1.0546(G) speedups     0.8359(G)     0.9872(d)     0.8141(G)     1.0281(d) 
970 [    ne    and   double] : auto speedups     0.9695(G)     0.9845(G)     0.9694(G)     1.0407(G) speedups     0.8207(G)     2.3822(d)     0.8146(G)     2.3439(d) 
971 [    ne    xor  logical] : auto speedups     1.1119(G)     1.1212(G)     1.1003(G)     1.1505(G) speedups     0.8338(G)     1.0166(d)     0.8252(G)     1.0505(d) 
972 [    ne    xor     int8] : auto speedups     1.0902(G)     1.1083(G)     1.1035(G)     1.1743(G) speedups     0.9319(G)     0.8675(d)     0.9089(G)     0.9106(d) 
973 [    ne    xor    uint8] : auto speedups     1.0682(G)     1.1133(G)     1.0905(G)     1.1703(G) speedups     0.9097(G)     0.8802(d)     0.8835(G)     0.9134(d) 
974 [    ne    xor    int16] : auto speedups     1.0649(G)     1.0909(G)     1.0786(G)     1.1671(G) speedups     0.8790(G)     0.8733(d)     0.8584(G)     0.9206(d) 
975 [    ne    xor   uint16] : auto speedups     1.0657(G)     1.1115(G)     1.0862(G)     1.1629(G) speedups     0.9533(G)     0.8598(d)     0.9342(G)     0.9156(d) 
976 [    ne    xor    int32] : auto speedups     1.0757(G)     1.0969(G)     1.0730(G)     1.1432(G) speedups     0.8958(G)     0.8754(d)     0.8678(G)     0.9264(d) 
977 [    ne    xor   uint32] : auto speedups     1.0788(G)     1.0976(G)     1.0871(G)     1.1660(G) speedups     0.9657(G)     0.8670(d)     0.9569(G)     0.9346(d) 
978 [    ne    xor    int64] : auto speedups     1.0518(G)     1.0626(G)     1.0546(G)     1.0969(G) speedups     0.9192(G)     0.8719(d)     0.9339(G)     0.9280(d) 
979 [    ne    xor   uint64] : auto speedups     1.0306(G)     1.0482(G)     1.0505(G)     1.0955(G) speedups     0.8829(G)     0.8216(d)     0.8662(G)     0.8678(d) 
980 [    ne    xor   single] : auto speedups     1.0521(G)     1.0377(G)     1.0400(G)     1.0975(G) speedups     0.8639(G)     1.0063(d)     0.8467(G)     1.0471(d) 
981 [    ne    xor   double] : auto speedups     1.0287(G)     1.0247(G)     1.0379(G)     1.0922(G) speedups     0.9069(G)     2.4924(d)     0.8783(G)     2.5102(d) 
982 [    ne     eq  logical] : auto speedups     1.0821(G)     1.0896(G)     1.0929(G)     1.1669(G) speedups     0.9734(G)     1.0051(d)     0.9605(G)     1.0431(d) 
983 [    ne     eq     int8] : auto speedups     1.0688(G)     1.1021(G)     1.0897(G)     1.1630(G) speedups     0.9008(G)     0.8647(d)     0.8965(G)     0.8971(d) 
984 [    ne     eq    uint8] : auto speedups     1.0435(G)     1.0965(G)     1.0593(G)     1.1548(G) speedups     0.8796(G)     0.8562(d)     0.8637(G)     0.9081(d) 
985 [    ne     eq    int16] : auto speedups     1.0646(G)     1.0635(G)     1.0698(G)     1.1388(G) speedups     0.8867(G)     0.8503(d)     0.8547(G)     0.8930(d) 
986 [    ne     eq   uint16] : auto speedups     1.0686(G)     1.0784(G)     1.0727(G)     1.1688(G) speedups     0.9632(G)     0.8548(d)     0.9430(G)     0.8992(d) 
987 [    ne     eq    int32] : auto speedups     1.0562(G)     1.0935(G)     1.0701(G)     1.1507(G) speedups     0.9762(G)     0.8638(d)     0.9593(G)     0.9235(d) 
988 [    ne     eq   uint32] : auto speedups     1.0534(G)     1.0728(G)     1.0715(G)     1.1299(G) speedups     0.9039(G)     0.8612(d)     0.8899(G)     0.9019(d) 
989 [    ne     eq    int64] : auto speedups     1.0263(G)     1.0494(G)     1.0351(G)     1.0944(G) speedups     0.9422(G)     0.8831(d)     0.9403(G)     0.9062(d) 
990 [    ne     eq   uint64] : auto speedups     1.0289(G)     1.0376(G)     1.0297(G)     1.0906(G) speedups     0.8876(G)     0.8074(d)     0.8672(G)     0.8673(d) 
991 [    ne     eq   single] : auto speedups     1.0677(G)     1.0768(G)     1.0879(G)     1.1491(G) speedups     0.8924(G)     0.9747(d)     0.8825(G)     1.0308(d) 
992 [    ne     eq   double] : auto speedups     1.0454(G)     1.0564(G)     1.0387(G)     1.1064(G) speedups     0.9042(G)     2.4994(d)     0.8868(G)     2.5532(d) 
993 [    gt    min  logical] : auto speedups     1.0281(G)     1.0759(G)     1.0365(G)     1.1198(G) speedups     0.9550(G)     1.0205(d)     0.9296(G)     1.0721(d) 
994 [    gt    min     int8] : auto speedups     1.0787(G)     1.1011(G)     1.0745(G)     1.1637(G) speedups     0.8964(G)     0.8486(d)     0.8868(G)     0.9013(d) 
995 [    gt    min    uint8] : auto speedups     1.0354(G)     1.0727(G)     1.0520(G)     1.1134(G) speedups     0.8972(G)     0.8349(d)     0.8693(G)     0.8949(d) 
996 [    gt    min    int16] : auto speedups     1.0754(G)     1.0897(G)     1.0803(G)     1.1372(G) speedups     0.9087(G)     0.8583(d)     0.8949(G)     0.9038(d) 
997 [    gt    min   uint16] : auto speedups     1.0605(G)     1.0542(G)     1.0452(G)     1.1213(G) speedups     0.9008(G)     0.8722(d)     0.8787(G)     0.8835(d) 
998 [    gt    min    int32] : auto speedups     1.0656(G)     1.0910(G)     1.0661(G)     1.1633(G) speedups     0.9008(G)     0.8680(d)     0.8835(G)     0.9002(d) 
999 [    gt    min   uint32] : auto speedups     1.0167(G)     1.0261(G)     1.0313(G)     1.1069(G) speedups     0.8743(G)     0.8618(d)     0.8660(G)     0.9158(d) 
1000 [    gt    min    int64] : auto speedups     1.0057(G)     1.0662(G)     1.0286(G)     1.0942(G) speedups     0.9345(G)     0.8627(d)     0.8914(G)     0.9196(d) 
1001 [    gt    min   uint64] : auto speedups     1.0019(G)     1.0165(G)     1.0004(G)     1.0677(G) speedups     0.8620(G)     0.8149(d)     0.8287(G)     0.8552(d) 
1002 [    gt    min   single] : auto speedups     1.0430(G)     1.0643(G)     1.0347(G)     1.1141(G) speedups     0.9050(G)     1.0019(d)     0.8861(G)     1.0566(d) 
1003 [    gt    min   double] : auto speedups     1.0042(G)     1.0367(G)     1.0057(G)     1.0810(G) speedups     0.8131(G)     2.2966(d)     0.7947(G)     2.3016(d) 
1004 [    gt    max  logical] : auto speedups     1.0986(G)     1.1275(G)     1.1712(G)     1.2395(G) speedups     0.9120(G)     1.0168(d)     0.8959(G)     1.0436(d) 
1005 [    gt    max     int8] : auto speedups     1.1411(G)     1.1333(G)     1.1670(G)     1.2047(G) speedups     0.9101(G)     0.8272(d)     0.8817(G)     0.8651(d) 
1006 [    gt    max    uint8] : auto speedups     1.1441(G)     1.1415(G)     1.1490(G)     1.2208(G) speedups     0.8949(G)     0.8302(d)     0.8700(G)     0.8800(d) 
1007 [    gt    max    int16] : auto speedups     1.1189(G)     1.1588(G)     1.1374(G)     1.2371(G) speedups     0.9335(G)     0.8250(d)     0.9223(G)     0.8818(d) 
1008 [    gt    max   uint16] : auto speedups     1.1206(G)     1.1612(G)     1.0995(G)     1.2290(G) speedups     0.8168(G)     0.8308(d)     0.8111(G)     0.8836(d) 
1009 [    gt    max    int32] : auto speedups     1.1471(G)     1.1522(G)     1.1218(G)     1.2446(G) speedups     0.9005(G)     0.8399(d)     0.9292(G)     0.8883(d) 
1010 [    gt    max   uint32] : auto speedups     1.1233(G)     1.1578(G)     1.1449(G)     1.2218(G) speedups     0.9011(G)     0.8490(d)     0.8789(G)     0.8892(d) 
1011 [    gt    max    int64] : auto speedups     1.1037(G)     1.1010(G)     1.1149(G)     1.1971(G) speedups     0.9263(G)     0.8464(d)     0.8970(G)     0.8864(d) 
1012 [    gt    max   uint64] : auto speedups     1.0926(G)     1.1085(G)     1.1061(G)     1.1921(G) speedups     0.8912(G)     0.7890(d)     0.8706(G)     0.8328(d) 
1013 [    gt    max   single] : auto speedups     1.0186(G)     1.0400(G)     1.0169(G)     1.0962(G) speedups     0.9213(G)     0.9944(d)     0.9040(G)     1.0605(d) 
1014 [    gt    max   double] : auto speedups     0.9730(G)     0.9997(G)     1.0027(G)     1.0653(G) speedups     0.9346(G)     2.1899(d)     0.9065(G)     2.2576(d) 
1015 [    gt   plus  logical] : auto speedups     1.1540(G)     1.1731(G)     1.1142(G)     1.1656(G) speedups     0.8773(G)     0.9653(d)     0.8416(G)     1.0399(d) 
1016 [    gt   plus     int8] : auto speedups     1.1370(G)     1.1534(G)     1.1429(G)     1.2258(G) speedups     0.9009(G)     0.8324(d)     0.8856(G)     0.8761(d) 
1017 [    gt   plus    uint8] : auto speedups     1.1070(G)     1.1600(G)     1.1279(G)     1.2328(G) speedups     0.8910(G)     0.8405(d)     0.8665(G)     0.8782(d) 
1018 [    gt   plus    int16] : auto speedups     1.1320(G)     1.1567(G)     1.1620(G)     1.2196(G) speedups     0.9343(G)     0.8334(d)     0.9046(G)     0.8857(d) 
1019 [    gt   plus   uint16] : auto speedups     1.1175(G)     1.1607(G)     1.1347(G)     1.2177(G) speedups     0.8229(G)     0.8377(d)     0.8055(G)     0.8706(d) 
1020 [    gt   plus    int32] : auto speedups     1.1414(G)     1.1732(G)     1.1330(G)     1.2381(G) speedups     0.9415(G)     0.8401(d)     0.9280(G)     0.8818(d) 
1021 [    gt   plus   uint32] : auto speedups     1.1449(G)     1.1829(G)     1.1389(G)     1.2150(G) speedups     0.9083(G)     0.8112(d)     0.8788(G)     0.8898(d) 
1022 [    gt   plus    int64] : auto speedups     1.0844(G)     1.1258(G)     1.0100(G)     1.1775(G) speedups     0.9151(G)     0.8391(d)     0.8944(G)     0.8756(d) 
1023 [    gt   plus   uint64] : auto speedups     1.1104(G)     1.1192(G)     1.1049(G)     1.1817(G) speedups     0.8790(G)     0.7927(d)     0.8741(G)     0.8234(d) 
1024 [    gt   plus   single] : auto speedups     1.0272(G)     1.0441(G)     1.0355(G)     1.0888(G) speedups     0.9344(G)     0.9880(d)     0.9111(G)     1.0409(d) 
1025 [    gt   plus   double] : auto speedups     0.9969(G)     1.0134(G)     1.0161(G)     1.0736(G) speedups     0.9351(G)     2.0807(d)     0.9220(G)     2.1480(d) 
1026 [    gt  times  logical] : auto speedups     1.0559(G)     1.0653(G)     1.0515(G)     1.1288(G) speedups     0.9659(G)     1.0293(d)     0.9345(G)     1.0717(d) 
1027 [    gt  times     int8] : auto speedups     1.0554(G)     1.0935(G)     1.0848(G)     1.1568(G) speedups     0.9065(G)     0.8615(d)     0.8943(G)     0.8863(d) 
1028 [    gt  times    uint8] : auto speedups     1.0449(G)     1.0745(G)     1.0243(G)     1.1183(G) speedups     0.8877(G)     0.8572(d)     0.8760(G)     0.8940(d) 
1029 [    gt  times    int16] : auto speedups     1.0369(G)     1.0803(G)     1.0775(G)     1.1503(G) speedups     0.9324(G)     0.8611(d)     0.9021(G)     0.9020(d) 
1030 [    gt  times   uint16] : auto speedups     1.0436(G)     1.0623(G)     1.0273(G)     1.1305(G) speedups     0.8953(G)     0.8630(d)     0.8758(G)     0.9018(d) 
1031 [    gt  times    int32] : auto speedups     1.0606(G)     1.0818(G)     1.0817(G)     1.1364(G) speedups     0.9181(G)     0.8752(d)     0.8842(G)     0.9065(d) 
1032 [    gt  times   uint32] : auto speedups     1.0385(G)     1.0390(G)     1.0425(G)     1.1127(G) speedups     0.8702(G)     0.8639(d)     0.8425(G)     0.9039(d) 
1033 [    gt  times    int64] : auto speedups     1.0224(G)     1.0555(G)     1.0332(G)     1.1217(G) speedups     0.9205(G)     0.8645(d)     0.9075(G)     0.9051(d) 
1034 [    gt  times   uint64] : auto speedups     0.9878(G)     1.0309(G)     1.0255(G)     1.0712(G) speedups     0.8541(G)     0.8150(d)     0.8302(G)     0.8520(d) 
1035 [    gt  times   single] : auto speedups     1.0319(G)     1.0521(G)     1.0372(G)     1.1102(G) speedups     0.9064(G)     1.0111(d)     0.8839(G)     1.0539(d) 
1036 [    gt  times   double] : auto speedups     1.0128(G)     1.0230(G)     1.0189(G)     1.0797(G) speedups     0.8225(G)     2.2604(d)     0.8049(G)     2.3440(d) 
1037 [    gt     or  logical] : auto speedups     1.1332(G)     1.1691(G)     1.1528(G)     1.2463(G) speedups     0.9185(G)     1.0043(d)     0.8843(G)     1.0482(d) 
1038 [    gt     or     int8] : auto speedups     1.1417(G)     1.1559(G)     1.1595(G)     1.2153(G) speedups     0.9063(G)     0.8312(d)     0.8872(G)     0.8722(d) 
1039 [    gt     or    uint8] : auto speedups     1.1467(G)     1.1639(G)     1.1334(G)     1.2218(G) speedups     0.8872(G)     0.8356(d)     0.8596(G)     0.8762(d) 
1040 [    gt     or    int16] : auto speedups     1.1265(G)     1.1426(G)     1.1444(G)     1.1999(G) speedups     0.9090(G)     0.8089(d)     0.9098(G)     0.8785(d) 
1041 [    gt     or   uint16] : auto speedups     1.1057(G)     1.1723(G)     1.1491(G)     1.2285(G) speedups     0.8284(G)     0.8443(d)     0.8086(G)     0.8830(d) 
1042 [    gt     or    int32] : auto speedups     1.1436(G)     1.1684(G)     1.1497(G)     1.2184(G) speedups     0.9352(G)     0.8388(d)     0.9275(G)     0.8871(d) 
1043 [    gt     or   uint32] : auto speedups     1.1665(G)     1.1606(G)     1.1688(G)     1.2187(G) speedups     0.9117(G)     0.8407(d)     0.8857(G)     0.8825(d) 
1044 [    gt     or    int64] : auto speedups     1.1047(G)     1.1163(G)     1.1162(G)     1.1722(G) speedups     0.9176(G)     0.8513(d)     0.9018(G)     0.8996(d) 
1045 [    gt     or   uint64] : auto speedups     1.1081(G)     1.1238(G)     1.1295(G)     1.1610(G) speedups     0.8893(G)     0.7939(d)     0.8766(G)     0.8286(d) 
1046 [    gt     or   single] : auto speedups     1.0114(G)     1.0602(G)     1.0243(G)     1.1155(G) speedups     0.9238(G)     0.9998(d)     0.8974(G)     1.0543(d) 
1047 [    gt     or   double] : auto speedups     0.9807(G)     1.0022(G)     1.0139(G)     1.0653(G) speedups     0.9425(G)     2.1382(d)     0.9213(G)     2.2346(d) 
1048 [    gt    and  logical] : auto speedups     1.0398(G)     1.0730(G)     1.0484(G)     1.1268(G) speedups     0.9445(G)     1.0276(d)     0.9320(G)     1.0687(d) 
1049 [    gt    and     int8] : auto speedups     1.0700(G)     1.0658(G)     1.0736(G)     1.1241(G) speedups     0.9069(G)     0.8318(d)     0.8808(G)     0.8973(d) 
1050 [    gt    and    uint8] : auto speedups     1.0463(G)     1.0565(G)     1.0601(G)     1.1286(G) speedups     0.8835(G)     0.8436(d)     0.8769(G)     0.8715(d) 
1051 [    gt    and    int16] : auto speedups     1.0479(G)     1.0788(G)     1.0758(G)     1.1321(G) speedups     0.9112(G)     0.8595(d)     0.8963(G)     0.9025(d) 
1052 [    gt    and   uint16] : auto speedups     1.0309(G)     1.0690(G)     1.0437(G)     1.1187(G) speedups     0.8975(G)     0.8659(d)     0.8774(G)     0.9049(d) 
1053 [    gt    and    int32] : auto speedups     1.0679(G)     1.0813(G)     1.0548(G)     1.1363(G) speedups     0.9021(G)     0.8634(d)     0.8876(G)     0.9141(d) 
1054 [    gt    and   uint32] : auto speedups     1.0399(G)     1.0398(G)     1.0363(G)     1.1082(G) speedups     0.8712(G)     0.8778(d)     0.8437(G)     0.8951(d) 
1055 [    gt    and    int64] : auto speedups     1.0400(G)     1.0209(G)     0.9675(G)     1.0367(G) speedups     0.8597(G)     0.8197(d)     0.8659(G)     0.8330(d) 
1056 [    gt    and   uint64] : auto speedups     0.9285(G)     0.9133(G)     0.9116(G)     0.9967(G) speedups     0.8004(G)     0.7685(d)     0.7843(G)     0.7657(d) 
1057 [    gt    and   single] : auto speedups     0.9386(G)     0.9972(G)     0.9951(G)     1.0608(G) speedups     0.8521(G)     0.9646(d)     0.8491(G)     1.0148(d) 
1058 [    gt    and   double] : auto speedups     0.9917(G)     1.0020(G)     1.0023(G)     1.0744(G) speedups     0.8242(G)     2.2766(d)     0.7943(G)     2.3622(d) 
1059 [    gt    xor  logical] : auto speedups     1.0822(G)     1.0831(G)     1.0812(G)     1.1571(G) speedups     0.9558(G)     0.9794(d)     0.9395(G)     1.0527(d) 
1060 [    gt    xor     int8] : auto speedups     1.0819(G)     1.1104(G)     1.0994(G)     1.1512(G) speedups     0.8977(G)     0.8639(d)     0.8666(G)     0.8966(d) 
1061 [    gt    xor    uint8] : auto speedups     1.0614(G)     1.0904(G)     1.0646(G)     1.1595(G) speedups     0.9412(G)     0.8595(d)     0.9269(G)     0.9231(d) 
1062 [    gt    xor    int16] : auto speedups     1.0776(G)     1.1149(G)     1.0936(G)     1.1642(G) speedups     0.8866(G)     0.8569(d)     0.8521(G)     0.9122(d) 
1063 [    gt    xor   uint16] : auto speedups     1.0495(G)     1.0637(G)     1.0737(G)     1.1273(G) speedups     0.9619(G)     0.8799(d)     0.9431(G)     0.9248(d) 
1064 [    gt    xor    int32] : auto speedups     1.0818(G)     1.0903(G)     1.0986(G)     1.1442(G) speedups     0.9635(G)     0.8787(d)     0.9457(G)     0.9097(d) 
1065 [    gt    xor   uint32] : auto speedups     0.9967(G)     1.0013(G)     1.0000(G)     1.0596(G) speedups     0.9154(G)     0.8725(d)     0.9024(G)     0.9060(d) 
1066 [    gt    xor    int64] : auto speedups     1.0345(G)     1.0439(G)     1.0380(G)     1.1199(G) speedups     0.9545(G)     0.8654(d)     0.9243(G)     0.9233(d) 
1067 [    gt    xor   uint64] : auto speedups     0.9522(G)     0.9729(G)     0.9684(G)     1.0294(G) speedups     0.8903(G)     0.8335(d)     0.8784(G)     0.8597(d) 
1068 [    gt    xor   single] : auto speedups     1.0151(G)     1.0212(G)     1.0122(G)     1.0903(G) speedups     0.8654(G)     0.9572(d)     0.8429(G)     1.0590(d) 
1069 [    gt    xor   double] : auto speedups     0.9864(G)     1.0046(G)     0.9908(G)     1.0416(G) speedups     0.9197(G)     2.1365(d)     0.8910(G)     2.1877(d) 
1070 [    gt     eq  logical] : auto speedups     1.0401(G)     1.0694(G)     1.0524(G)     1.1346(G) speedups     0.9634(G)     0.9887(d)     0.9520(G)     1.0329(d) 
1071 [    gt     eq     int8] : auto speedups     1.0362(G)     1.0999(G)     1.0441(G)     1.1488(G) speedups     0.8937(G)     0.8561(d)     0.8710(G)     0.9086(d) 
1072 [    gt     eq    uint8] : auto speedups     1.0444(G)     1.0501(G)     1.0509(G)     1.1203(G) speedups     0.9175(G)     0.8568(d)     0.8949(G)     0.9099(d) 
1073 [    gt     eq    int16] : auto speedups     1.0699(G)     1.0927(G)     1.0789(G)     1.0837(G) speedups     0.8777(G)     0.8571(d)     0.8623(G)     0.8945(d) 
1074 [    gt     eq   uint16] : auto speedups     1.0577(G)     1.0516(G)     1.0565(G)     1.1152(G) speedups     0.9159(G)     0.8755(d)     0.8807(G)     0.9099(d) 
1075 [    gt     eq    int32] : auto speedups     1.0467(G)     1.0669(G)     1.0408(G)     1.1256(G) speedups     0.8949(G)     0.8694(d)     0.8624(G)     0.9038(d) 
1076 [    gt     eq   uint32] : auto speedups     0.9661(G)     0.9829(G)     0.9735(G)     1.0452(G) speedups     0.9561(G)     0.8777(d)     0.9187(G)     0.9017(d) 
1077 [    gt     eq    int64] : auto speedups     1.0323(G)     1.0452(G)     1.0335(G)     1.1103(G) speedups     0.8783(G)     0.8833(d)     0.8681(G)     0.9081(d) 
1078 [    gt     eq   uint64] : auto speedups     0.9532(G)     0.9664(G)     0.9450(G)     1.0157(G) speedups     0.9113(G)     0.8239(d)     0.8724(G)     0.8494(d) 
1079 [    gt     eq   single] : auto speedups     1.0085(G)     1.0322(G)     1.0023(G)     1.0918(G) speedups     0.9997(G)     0.9838(d)     0.9669(G)     0.9967(d) 
1080 [    gt     eq   double] : auto speedups     0.9707(G)     1.0062(G)     0.9828(G)     1.0573(G) speedups     0.9290(G)     2.1535(d)     0.9067(G)     2.1450(d) 
1081 [    lt    min  logical] : auto speedups     1.0413(G)     1.0698(G)     1.0696(G)     1.0989(G) speedups     0.9068(G)     1.0240(d)     0.8878(G)     1.0801(d) 
1082 [    lt    min     int8] : auto speedups     1.0688(G)     1.0795(G)     1.0865(G)     1.1491(G) speedups     0.9453(G)     0.8559(d)     0.9191(G)     0.9022(d) 
1083 [    lt    min    uint8] : auto speedups     1.0371(G)     1.0671(G)     1.0554(G)     1.1392(G) speedups     0.8595(G)     0.8557(d)     0.8399(G)     0.8961(d) 
1084 [    lt    min    int16] : auto speedups     1.0377(G)     1.0841(G)     1.0734(G)     1.1486(G) speedups     0.9173(G)     0.8598(d)     0.9084(G)     0.8994(d) 
1085 [    lt    min   uint16] : auto speedups     1.0426(G)     1.0533(G)     1.0448(G)     1.1142(G) speedups     0.8411(G)     0.8674(d)     0.8214(G)     0.9067(d) 
1086 [    lt    min    int32] : auto speedups     1.0570(G)     1.0828(G)     1.0558(G)     1.1458(G) speedups     0.8952(G)     0.8531(d)     0.8859(G)     0.8993(d) 
1087 [    lt    min   uint32] : auto speedups     1.0009(G)     1.0348(G)     1.0232(G)     1.0934(G) speedups     0.8125(G)     0.8657(d)     0.7981(G)     0.9143(d) 
1088 [    lt    min    int64] : auto speedups     1.0394(G)     1.0339(G)     1.0133(G)     1.1142(G) speedups     0.9156(G)     0.7944(d)     0.8603(G)     0.8502(d) 
1089 [    lt    min   uint64] : auto speedups     1.0007(G)     0.9801(G)     1.0147(G)     1.0587(G) speedups     0.8000(G)     0.8063(d)     0.7900(G)     0.8391(d) 
1090 [    lt    min   single] : auto speedups     1.0005(G)     1.0575(G)     1.0314(G)     1.0901(G) speedups     0.8945(G)     1.0007(d)     0.8817(G)     1.0377(d) 
1091 [    lt    min   double] : auto speedups     0.9839(G)     1.0198(G)     0.9655(G)     1.0640(G) speedups     0.9456(G)     2.1540(d)     0.9114(G)     2.1566(d) 
1092 [    lt    max  logical] : auto speedups     1.1314(G)     1.1714(G)     1.1317(G)     1.2316(G) speedups     0.9093(G)     1.0014(d)     0.8962(G)     1.0457(d) 
1093 [    lt    max     int8] : auto speedups     1.1390(G)     1.1586(G)     1.1409(G)     1.2330(G) speedups     0.8415(G)     0.8323(d)     0.8250(G)     0.8800(d) 
1094 [    lt    max    uint8] : auto speedups     1.1368(G)     1.1595(G)     1.1496(G)     1.2163(G) speedups     0.8367(G)     0.8404(d)     0.8196(G)     0.8772(d) 
1095 [    lt    max    int16] : auto speedups     1.1271(G)     1.1397(G)     1.1362(G)     1.2336(G) speedups     0.8368(G)     0.8380(d)     0.8158(G)     0.8896(d) 
1096 [    lt    max   uint16] : auto speedups     1.1442(G)     1.1543(G)     1.1512(G)     1.2187(G) speedups     0.9651(G)     0.8442(d)     0.9468(G)     0.8798(d) 
1097 [    lt    max    int32] : auto speedups     1.1531(G)     1.1607(G)     1.1413(G)     1.2412(G) speedups     0.9066(G)     0.8590(d)     0.8831(G)     0.8805(d) 
1098 [    lt    max   uint32] : auto speedups     1.1420(G)     1.1699(G)     1.1346(G)     1.1995(G) speedups     0.9370(G)     0.8314(d)     0.9187(G)     0.8946(d) 
1099 [    lt    max    int64] : auto speedups     1.1175(G)     1.1308(G)     1.1171(G)     1.1777(G) speedups     0.8806(G)     0.8447(d)     0.8568(G)     0.8641(d) 
1100 [    lt    max   uint64] : auto speedups     1.0829(G)     1.1347(G)     1.1095(G)     1.1639(G) speedups     0.9097(G)     0.8005(d)     0.9050(G)     0.8305(d) 
1101 [    lt    max   single] : auto speedups     1.0311(G)     1.0278(G)     1.0346(G)     1.1064(G) speedups     0.8414(G)     0.9858(d)     0.8280(G)     1.0384(d) 
1102 [    lt    max   double] : auto speedups     0.9920(G)     1.0158(G)     1.0110(G)     1.0605(G) speedups     0.8212(G)     2.1993(d)     0.8021(G)     2.2505(d) 
1103 [    lt   plus  logical] : auto speedups     1.1506(G)     1.1619(G)     1.1670(G)     1.2200(G) speedups     0.9028(G)     1.0131(d)     0.8835(G)     1.0510(d) 
1104 [    lt   plus     int8] : auto speedups     1.1438(G)     1.0973(G)     1.1460(G)     1.2197(G) speedups     0.8492(G)     0.8318(d)     0.8186(G)     0.8761(d) 
1105 [    lt   plus    uint8] : auto speedups     1.1400(G)     1.1734(G)     1.1222(G)     1.2348(G) speedups     0.8407(G)     0.8247(d)     0.8215(G)     0.8837(d) 
1106 [    lt   plus    int16] : auto speedups     1.1198(G)     1.1752(G)     1.1401(G)     1.2314(G) speedups     0.8193(G)     0.8285(d)     0.8121(G)     0.8768(d) 
1107 [    lt   plus   uint16] : auto speedups     1.1505(G)     1.1586(G)     1.1352(G)     1.2206(G) speedups     0.9535(G)     0.8380(d)     0.9427(G)     0.8733(d) 
1108 [    lt   plus    int32] : auto speedups     1.1602(G)     1.1427(G)     1.1800(G)     1.2028(G) speedups     0.9041(G)     0.8403(d)     0.8757(G)     0.8645(d) 
1109 [    lt   plus   uint32] : auto speedups     1.1559(G)     1.1782(G)     1.1604(G)     1.2337(G) speedups     0.9209(G)     0.8422(d)     0.9249(G)     0.8839(d) 
1110 [    lt   plus    int64] : auto speedups     1.0823(G)     1.1118(G)     1.1067(G)     1.1822(G) speedups     0.8894(G)     0.8414(d)     0.8449(G)     0.8823(d) 
1111 [    lt   plus   uint64] : auto speedups     1.1218(G)     1.1035(G)     1.1365(G)     1.1900(G) speedups     0.9329(G)     0.7910(d)     0.9009(G)     0.8309(d) 
1112 [    lt   plus   single] : auto speedups     0.9883(G)     1.0491(G)     1.0286(G)     1.0963(G) speedups     0.8657(G)     0.9980(d)     0.8426(G)     1.0376(d) 
1113 [    lt   plus   double] : auto speedups     1.0020(G)     1.0316(G)     1.0151(G)     1.0655(G) speedups     0.8119(G)     2.1828(d)     0.7915(G)     2.2810(d) 
1114 [    lt  times  logical] : auto speedups     1.0456(G)     1.0702(G)     1.0612(G)     1.1314(G) speedups     0.9108(G)     1.0319(d)     0.8890(G)     1.0642(d) 
1115 [    lt  times     int8] : auto speedups     1.0725(G)     1.1070(G)     1.0857(G)     1.1480(G) speedups     0.9469(G)     0.8541(d)     0.9188(G)     0.8964(d) 
1116 [    lt  times    uint8] : auto speedups     1.0003(G)     1.0692(G)     1.0345(G)     1.0233(G) speedups     0.8562(G)     0.8596(d)     0.8328(G)     0.8781(d) 
1117 [    lt  times    int16] : auto speedups     1.0611(G)     1.0906(G)     1.0652(G)     1.1459(G) speedups     0.9263(G)     0.8663(d)     0.8995(G)     0.9085(d) 
1118 [    lt  times   uint16] : auto speedups     1.0322(G)     1.0563(G)     1.0341(G)     1.1300(G) speedups     0.8384(G)     0.8538(d)     0.8212(G)     0.8979(d) 
1119 [    lt  times    int32] : auto speedups     1.0689(G)     1.0822(G)     1.0876(G)     1.1534(G) speedups     0.9032(G)     0.8568(d)     0.8806(G)     0.9088(d) 
1120 [    lt  times   uint32] : auto speedups     1.0255(G)     1.0489(G)     1.0227(G)     1.1081(G) speedups     0.8162(G)     0.8479(d)     0.8038(G)     0.8870(d) 
1121 [    lt  times    int64] : auto speedups     1.0392(G)     1.0468(G)     1.0479(G)     1.1005(G) speedups     0.9264(G)     0.8669(d)     0.9180(G)     0.8872(d) 
1122 [    lt  times   uint64] : auto speedups     1.0034(G)     1.0050(G)     1.0051(G)     1.0596(G) speedups     0.8104(G)     0.8081(d)     0.7846(G)     0.8547(d) 
1123 [    lt  times   single] : auto speedups     1.0157(G)     1.0311(G)     1.0303(G)     1.1072(G) speedups     0.8929(G)     0.9819(d)     0.8753(G)     1.0615(d) 
1124 [    lt  times   double] : auto speedups     0.9994(G)     1.0140(G)     0.9994(G)     1.0644(G) speedups     0.9352(G)     2.1985(d)     0.9269(G)     2.1707(d) 
1125 [    lt     or  logical] : auto speedups     1.1622(G)     1.1453(G)     1.1701(G)     1.2339(G) speedups     0.9178(G)     0.9995(d)     0.8966(G)     1.0479(d) 
1126 [    lt     or     int8] : auto speedups     1.1228(G)     1.1608(G)     1.1565(G)     1.2271(G) speedups     0.8402(G)     0.8368(d)     0.8231(G)     0.8749(d) 
1127 [    lt     or    uint8] : auto speedups     1.1331(G)     1.1629(G)     1.1404(G)     1.2310(G) speedups     0.8320(G)     0.8226(d)     0.8212(G)     0.8799(d) 
1128 [    lt     or    int16] : auto speedups     1.1113(G)     1.1461(G)     1.1435(G)     1.1922(G) speedups     0.8292(G)     0.8171(d)     0.8036(G)     0.8738(d) 
1129 [    lt     or   uint16] : auto speedups     1.1372(G)     1.1411(G)     1.1475(G)     1.2005(G) speedups     0.9694(G)     0.8449(d)     0.9436(G)     0.8817(d) 
1130 [    lt     or    int32] : auto speedups     1.1332(G)     1.1529(G)     1.1466(G)     1.2244(G) speedups     0.9058(G)     0.8426(d)     0.8799(G)     0.8940(d) 
1131 [    lt     or   uint32] : auto speedups     1.1539(G)     1.1513(G)     1.1643(G)     1.2310(G) speedups     0.9192(G)     0.8457(d)     0.9098(G)     0.8800(d) 
1132 [    lt     or    int64] : auto speedups     1.0883(G)     1.1097(G)     1.1230(G)     1.1541(G) speedups     0.8878(G)     0.8106(d)     0.8587(G)     0.8782(d) 
1133 [    lt     or   uint64] : auto speedups     1.0580(G)     1.1119(G)     1.0994(G)     1.1910(G) speedups     0.9147(G)     0.7909(d)     0.8954(G)     0.8322(d) 
1134 [    lt     or   single] : auto speedups     1.0097(G)     1.0284(G)     1.0563(G)     1.0955(G) speedups     0.8528(G)     0.9965(d)     0.8367(G)     1.0359(d) 
1135 [    lt     or   double] : auto speedups     0.9835(G)     1.0194(G)     1.0006(G)     1.0883(G) speedups     0.8240(G)     2.2270(d)     0.7976(G)     2.2983(d) 
1136 [    lt    and  logical] : auto speedups     1.0341(G)     1.0690(G)     1.0685(G)     1.1272(G) speedups     0.9121(G)     1.0315(d)     0.8822(G)     1.0721(d) 
1137 [    lt    and     int8] : auto speedups     1.0534(G)     1.0976(G)     1.0728(G)     1.1404(G) speedups     0.9453(G)     0.8566(d)     0.9173(G)     0.8964(d) 
1138 [    lt    and    uint8] : auto speedups     1.0385(G)     1.0598(G)     1.0512(G)     1.1257(G) speedups     0.8609(G)     0.8548(d)     0.8432(G)     0.8457(d) 
1139 [    lt    and    int16] : auto speedups     1.0033(G)     1.0191(G)     0.9680(G)     1.0912(G) speedups     0.9296(G)     0.8578(d)     0.8979(G)     0.9009(d) 
1140 [    lt    and   uint16] : auto speedups     1.0410(G)     1.0718(G)     1.0350(G)     1.1037(G) speedups     0.8370(G)     0.8686(d)     0.8159(G)     0.8995(d) 
1141 [    lt    and    int32] : auto speedups     1.0686(G)     1.0711(G)     1.0640(G)     1.1649(G) speedups     0.8944(G)     0.8797(d)     0.8816(G)     0.9048(d) 
1142 [    lt    and   uint32] : auto speedups     1.0268(G)     1.0500(G)     1.0409(G)     1.1016(G) speedups     0.8077(G)     0.8649(d)     0.7859(G)     0.9041(d) 
1143 [    lt    and    int64] : auto speedups     1.0356(G)     1.0120(G)     1.0236(G)     1.0866(G) speedups     0.9151(G)     0.8681(d)     0.9137(G)     0.8937(d) 
1144 [    lt    and   uint64] : auto speedups     0.9977(G)     1.0208(G)     0.9794(G)     1.0377(G) speedups     0.7990(G)     0.8038(d)     0.7819(G)     0.8050(d) 
1145 [    lt    and   single] : auto speedups     1.0284(G)     1.0445(G)     1.0398(G)     1.1013(G) speedups     0.8907(G)     0.8382(d)     0.8700(G)     1.0503(d) 
1146 [    lt    and   double] : auto speedups     1.0154(G)     1.0028(G)     1.0030(G)     1.0632(G) speedups     0.9350(G)     2.1951(d)     0.9168(G)     2.2876(d) 
1147 [    lt    xor  logical] : auto speedups     1.0368(G)     1.0967(G)     1.0551(G)     1.1442(G) speedups     0.8399(G)     1.0183(d)     0.8119(G)     1.0709(d) 
1148 [    lt    xor     int8] : auto speedups     1.0786(G)     1.1045(G)     1.0984(G)     1.0972(G) speedups     0.8834(G)     0.8634(d)     0.8725(G)     0.9137(d) 
1149 [    lt    xor    uint8] : auto speedups     1.0415(G)     1.0761(G)     1.0546(G)     1.1516(G) speedups     0.9217(G)     0.8690(d)     0.8991(G)     0.9231(d) 
1150 [    lt    xor    int16] : auto speedups     1.0679(G)     1.1103(G)     1.0625(G)     1.1624(G) speedups     0.8838(G)     0.8616(d)     0.8646(G)     0.9141(d) 
1151 [    lt    xor   uint16] : auto speedups     1.0523(G)     1.0462(G)     1.0767(G)     1.1242(G) speedups     0.9069(G)     0.8680(d)     0.9008(G)     0.8919(d) 
1152 [    lt    xor    int32] : auto speedups     1.0825(G)     1.0910(G)     1.0816(G)     1.1707(G) speedups     0.9567(G)     0.8813(d)     0.9433(G)     0.9073(d) 
1153 [    lt    xor   uint32] : auto speedups     0.9740(G)     1.0128(G)     0.9858(G)     1.0424(G) speedups     0.9213(G)     0.8809(d)     0.8646(G)     0.9297(d) 
1154 [    lt    xor    int64] : auto speedups     1.0285(G)     1.0622(G)     1.0295(G)     1.1063(G) speedups     0.9434(G)     0.8618(d)     0.9253(G)     0.9191(d) 
1155 [    lt    xor   uint64] : auto speedups     0.9538(G)     0.9735(G)     0.9639(G)     1.0187(G) speedups     0.8970(G)     0.8233(d)     0.8826(G)     0.8760(d) 
1156 [    lt    xor   single] : auto speedups     1.0168(G)     1.0269(G)     1.0276(G)     1.1018(G) speedups     0.9157(G)     0.9835(d)     0.9054(G)     1.0650(d) 
1157 [    lt    xor   double] : auto speedups     0.9771(G)     0.9995(G)     0.9811(G)     1.0592(G) speedups     0.8543(G)     2.0510(d)     0.8341(G)     2.2323(d) 
1158 [    lt     eq  logical] : auto speedups     1.0454(G)     1.0603(G)     1.0705(G)     1.1410(G) speedups     0.8399(G)     0.9784(d)     0.8200(G)     1.0697(d) 
1159 [    lt     eq     int8] : auto speedups     1.0519(G)     1.0824(G)     1.0661(G)     1.1398(G) speedups     0.8794(G)     0.8514(d)     0.8593(G)     0.9276(d) 
1160 [    lt     eq    uint8] : auto speedups     1.0410(G)     1.0350(G)     1.0668(G)     1.1090(G) speedups     0.9280(G)     0.8560(d)     0.9022(G)     0.9014(d) 
1161 [    lt     eq    int16] : auto speedups     1.0705(G)     1.0647(G)     1.0736(G)     1.1459(G) speedups     0.9360(G)     0.8588(d)     0.9165(G)     0.8960(d) 
1162 [    lt     eq   uint16] : auto speedups     1.0158(G)     1.0635(G)     1.0574(G)     1.1230(G) speedups     0.9073(G)     0.8589(d)     0.8861(G)     0.9033(d) 
1163 [    lt     eq    int32] : auto speedups     0.7943(G)     1.0622(G)     1.0743(G)     1.1316(G) speedups     0.9592(G)     0.8725(d)     0.9486(G)     0.9135(d) 
1164 [    lt     eq   uint32] : auto speedups     0.9622(G)     1.0075(G)     0.9709(G)     1.0577(G) speedups     0.9255(G)     0.8743(d)     0.9096(G)     0.9188(d) 
1165 [    lt     eq    int64] : auto speedups     1.0270(G)     1.0483(G)     1.0456(G)     1.0683(G) speedups     0.9482(G)     0.8637(d)     0.9113(G)     0.9112(d) 
1166 [    lt     eq   uint64] : auto speedups     0.9527(G)     0.9669(G)     0.9534(G)     1.0284(G) speedups     0.8975(G)     0.8252(d)     0.8913(G)     0.8674(d) 
1167 [    lt     eq   single] : auto speedups     1.0045(G)     1.0223(G)     1.0030(G)     1.0847(G) speedups     0.8496(G)     0.9779(d)     0.8334(G)     1.0324(d) 
1168 [    lt     eq   double] : auto speedups     0.9547(G)     1.0080(G)     0.9768(G)     1.0459(G) speedups     0.9425(G)     2.0779(d)     0.9233(G)     2.1336(d) 
1169 [    ge    min  logical] : auto speedups     1.0683(G)     1.0644(G)     1.0939(G)     1.1518(G) speedups     0.8235(G)     1.0036(d)     0.8127(G)     1.0236(d) 
1170 [    ge    min     int8] : auto speedups     1.0728(G)     1.0718(G)     1.0718(G)     1.1428(G) speedups     0.8465(G)     0.8180(d)     0.8261(G)     0.8749(d) 
1171 [    ge    min    uint8] : auto speedups     1.0560(G)     1.0902(G)     1.0574(G)     1.1397(G) speedups     0.8301(G)     0.8143(d)     0.8133(G)     0.8517(d) 
1172 [    ge    min    int16] : auto speedups     1.0745(G)     1.0852(G)     1.0448(G)     1.1664(G) speedups     0.8364(G)     0.8249(d)     0.8288(G)     0.8678(d) 
1173 [    ge    min   uint16] : auto speedups     1.0233(G)     1.0896(G)     1.0631(G)     1.1137(G) speedups     0.9219(G)     0.7978(d)     0.8909(G)     0.8533(d) 
1174 [    ge    min    int32] : auto speedups     1.0719(G)     1.0753(G)     1.0707(G)     1.1496(G) speedups     0.9320(G)     0.8484(d)     0.9085(G)     0.8884(d) 
1175 [    ge    min   uint32] : auto speedups     1.0628(G)     1.0862(G)     1.0653(G)     1.1280(G) speedups     0.9331(G)     0.8439(d)     0.9219(G)     0.8806(d) 
1176 [    ge    min    int64] : auto speedups     1.0246(G)     1.0442(G)     1.0471(G)     1.0824(G) speedups     0.8939(G)     0.8325(d)     0.8611(G)     0.8744(d) 
1177 [    ge    min   uint64] : auto speedups     1.0248(G)     1.0287(G)     1.0254(G)     1.0999(G) speedups     0.8862(G)     0.7860(d)     0.8733(G)     0.8295(d) 
1178 [    ge    min   single] : auto speedups     1.0685(G)     1.0915(G)     1.0805(G)     1.1322(G) speedups     0.9433(G)     1.0155(d)     0.9153(G)     1.0593(d) 
1179 [    ge    min   double] : auto speedups     1.0494(G)     1.0514(G)     1.0434(G)     1.0758(G) speedups     0.9491(G)     2.2504(d)     0.9223(G)     2.3177(d) 
1180 [    ge    max  logical] : auto speedups     1.1397(G)     1.1804(G)     1.1516(G)     1.2218(G) speedups     0.8492(G)     1.0273(d)     0.8243(G)     1.0694(d) 
1181 [    ge    max     int8] : auto speedups     1.1355(G)     1.1266(G)     1.1338(G)     1.2319(G) speedups     0.9114(G)     0.7920(d)     0.8658(G)     0.7909(d) 
1182 [    ge    max    uint8] : auto speedups     1.0828(G)     1.0784(G)     1.0128(G)     1.1242(G) speedups     0.8547(G)     0.8107(d)     0.8106(G)     0.8032(d) 
1183 [    ge    max    int16] : auto speedups     1.0709(G)     1.0985(G)     1.0580(G)     1.1653(G) speedups     0.8874(G)     0.8594(d)     0.9197(G)     0.8933(d) 
1184 [    ge    max   uint16] : auto speedups     1.1310(G)     1.1629(G)     1.1495(G)     1.2075(G) speedups     0.8380(G)     0.8398(d)     0.8044(G)     0.9034(d) 
1185 [    ge    max    int32] : auto speedups     1.1518(G)     1.1798(G)     1.1566(G)     1.2508(G) speedups     0.9411(G)     0.8654(d)     0.9020(G)     0.9122(d) 
1186 [    ge    max   uint32] : auto speedups     1.1356(G)     1.1607(G)     1.1518(G)     1.2235(G) speedups     0.9163(G)     0.8655(d)     0.8801(G)     0.9158(d) 
1187 [    ge    max    int64] : auto speedups     1.0933(G)     1.1181(G)     1.1295(G)     1.1800(G) speedups     0.9182(G)     0.8707(d)     0.9130(G)     0.9029(d) 
1188 [    ge    max   uint64] : auto speedups     1.0942(G)     1.1206(G)     1.1143(G)     1.1982(G) speedups     0.9153(G)     0.8252(d)     0.9029(G)     0.8647(d) 
1189 [    ge    max   single] : auto speedups     1.0486(G)     1.0821(G)     1.0617(G)     1.1667(G) speedups     1.0884(G)     1.0083(d)     1.0803(G)     1.0561(d) 
1190 [    ge    max   double] : auto speedups     1.0181(G)     1.0567(G)     1.0433(G)     1.1226(G) speedups     0.9330(G)     2.3234(d)     0.8973(G)     2.3811(d) 
1191 [    ge   plus  logical] : auto speedups     1.1464(G)     1.1434(G)     1.1682(G)     1.2232(G) speedups     0.8512(G)     1.0320(d)     0.8309(G)     1.0827(d) 
1192 [    ge   plus     int8] : auto speedups     1.1223(G)     1.1539(G)     1.1757(G)     1.2277(G) speedups     0.9272(G)     0.8205(d)     0.8991(G)     0.8950(d) 
1193 [    ge   plus    uint8] : auto speedups     1.1385(G)     1.1571(G)     1.1672(G)     1.1979(G) speedups     0.9149(G)     0.8526(d)     0.8833(G)     0.8628(d) 
1194 [    ge   plus    int16] : auto speedups     1.1360(G)     1.1454(G)     1.1478(G)     1.2210(G) speedups     0.9457(G)     0.8570(d)     0.9341(G)     0.9087(d) 
1195 [    ge   plus   uint16] : auto speedups     1.1476(G)     1.1649(G)     1.1459(G)     1.2336(G) speedups     0.8235(G)     0.8417(d)     0.8148(G)     0.8946(d) 
1196 [    ge   plus    int32] : auto speedups     1.1591(G)     1.1632(G)     1.1470(G)     1.2250(G) speedups     0.9381(G)     0.8682(d)     0.9160(G)     0.9147(d) 
1197 [    ge   plus   uint32] : auto speedups     1.1471(G)     1.1570(G)     1.1595(G)     1.2254(G) speedups     0.9048(G)     0.8593(d)     0.8883(G)     0.9082(d) 
1198 [    ge   plus    int64] : auto speedups     1.0990(G)     1.1286(G)     1.0929(G)     1.1910(G) speedups     0.9396(G)     0.8596(d)     0.9085(G)     0.9144(d) 
1199 [    ge   plus   uint64] : auto speedups     1.0981(G)     1.1317(G)     1.1027(G)     1.1816(G) speedups     0.9240(G)     0.8040(d)     0.8918(G)     0.8670(d) 
1200 [    ge   plus   single] : auto speedups     1.0530(G)     1.0680(G)     1.0893(G)     1.1439(G) speedups     1.1025(G)     0.9843(d)     1.0511(G)     1.0604(d) 
1201 [    ge   plus   double] : auto speedups     1.0250(G)     1.0705(G)     1.0509(G)     1.0839(G) speedups     0.9349(G)     2.1909(d)     0.9165(G)     2.2614(d) 
1202 [    ge  times  logical] : auto speedups     1.0852(G)     1.0643(G)     1.1074(G)     1.1317(G) speedups     0.8375(G)     1.0162(d)     0.8273(G)     1.0327(d) 
1203 [    ge  times     int8] : auto speedups     1.0830(G)     1.0425(G)     1.0953(G)     1.1434(G) speedups     0.8415(G)     0.8228(d)     0.8343(G)     0.8692(d) 
1204 [    ge  times    uint8] : auto speedups     1.0718(G)     1.0901(G)     1.0751(G)     1.1306(G) speedups     0.8376(G)     0.8148(d)     0.8136(G)     0.8631(d) 
1205 [    ge  times    int16] : auto speedups     1.0434(G)     1.0863(G)     1.0763(G)     1.1546(G) speedups     0.8393(G)     0.8217(d)     0.8209(G)     0.8856(d) 
1206 [    ge  times   uint16] : auto speedups     1.0619(G)     1.0717(G)     1.0734(G)     1.1370(G) speedups     0.9236(G)     0.8335(d)     0.8974(G)     0.8799(d) 
1207 [    ge  times    int32] : auto speedups     1.0636(G)     1.0876(G)     1.0722(G)     1.1431(G) speedups     0.9361(G)     0.8422(d)     0.9210(G)     0.8835(d) 
1208 [    ge  times   uint32] : auto speedups     1.0570(G)     1.0860(G)     1.0755(G)     1.1377(G) speedups     0.9370(G)     0.8411(d)     0.9175(G)     0.8880(d) 
1209 [    ge  times    int64] : auto speedups     1.0285(G)     1.0260(G)     1.0510(G)     1.0957(G) speedups     0.8856(G)     0.8347(d)     0.8740(G)     0.8756(d) 
1210 [    ge  times   uint64] : auto speedups     1.0210(G)     1.0521(G)     0.9988(G)     1.1106(G) speedups     0.8787(G)     0.7928(d)     0.8727(G)     0.8306(d) 
1211 [    ge  times   single] : auto speedups     1.0596(G)     1.0800(G)     1.0672(G)     1.1407(G) speedups     0.9377(G)     0.9321(d)     0.8861(G)     1.0028(d) 
1212 [    ge  times   double] : auto speedups     1.0338(G)     1.0247(G)     1.0336(G)     1.0897(G) speedups     0.9457(G)     2.2081(d)     0.9181(G)     2.2221(d) 
1213 [    ge     or  logical] : auto speedups     1.1390(G)     1.1781(G)     1.1254(G)     1.2241(G) speedups     0.8594(G)     1.0254(d)     0.8240(G)     1.0667(d) 
1214 [    ge     or     int8] : auto speedups     1.1539(G)     1.1444(G)     1.1409(G)     1.2300(G) speedups     0.9272(G)     0.8309(d)     0.9062(G)     0.8957(d) 
1215 [    ge     or    uint8] : auto speedups     1.1346(G)     1.1579(G)     1.1556(G)     1.2091(G) speedups     0.9029(G)     0.8350(d)     0.8949(G)     0.8756(d) 
1216 [    ge     or    int16] : auto speedups     1.1367(G)     1.1625(G)     1.1397(G)     1.2189(G) speedups     0.9521(G)     0.8506(d)     0.9473(G)     0.9022(d) 
1217 [    ge     or   uint16] : auto speedups     1.1364(G)     1.1618(G)     1.1390(G)     1.1076(G) speedups     0.8120(G)     0.8136(d)     0.7973(G)     0.9005(d) 
1218 [    ge     or    int32] : auto speedups     1.1471(G)     1.1658(G)     1.1522(G)     1.2314(G) speedups     0.9317(G)     0.8478(d)     0.9122(G)     0.9101(d) 
1219 [    ge     or   uint32] : auto speedups     1.1410(G)     1.1492(G)     1.1620(G)     1.2283(G) speedups     0.8986(G)     0.8639(d)     0.8778(G)     0.9202(d) 
1220 [    ge     or    int64] : auto speedups     1.1058(G)     1.1109(G)     1.1110(G)     1.1736(G) speedups     0.9167(G)     0.8716(d)     0.8977(G)     0.9085(d) 
1221 [    ge     or   uint64] : auto speedups     1.1154(G)     1.1226(G)     1.0871(G)     1.1746(G) speedups     0.9342(G)     0.8090(d)     0.9049(G)     0.8651(d) 
1222 [    ge     or   single] : auto speedups     1.0400(G)     1.0953(G)     1.0492(G)     1.1639(G) speedups     1.0937(G)     1.0015(d)     1.0801(G)     1.0464(d) 
1223 [    ge     or   double] : auto speedups     1.0200(G)     1.0591(G)     1.0206(G)     1.1119(G) speedups     0.9276(G)     2.2497(d)     0.8833(G)     2.3539(d) 
1224 [    ge    and  logical] : auto speedups     1.0639(G)     1.0957(G)     1.0720(G)     1.1513(G) speedups     0.8494(G)     0.9840(d)     0.8230(G)     1.0654(d) 
1225 [    ge    and     int8] : auto speedups     1.0491(G)     1.0737(G)     1.0680(G)     1.1438(G) speedups     0.8206(G)     0.7513(d)     0.7918(G)     0.8500(d) 
1226 [    ge    and    uint8] : auto speedups     1.0161(G)     1.0909(G)     1.0655(G)     1.1544(G) speedups     0.8370(G)     0.8181(d)     0.8149(G)     0.8698(d) 
1227 [    ge    and    int16] : auto speedups     1.0482(G)     1.0646(G)     1.0894(G)     1.1337(G) speedups     0.8405(G)     0.8312(d)     0.8109(G)     0.8699(d) 
1228 [    ge    and   uint16] : auto speedups     1.0450(G)     1.0777(G)     1.0508(G)     1.1378(G) speedups     0.9195(G)     0.8186(d)     0.8928(G)     0.8851(d) 
1229 [    ge    and    int32] : auto speedups     1.0465(G)     1.0749(G)     1.0737(G)     1.1355(G) speedups     0.9396(G)     0.8349(d)     0.9095(G)     0.8736(d) 
1230 [    ge    and   uint32] : auto speedups     1.0645(G)     1.0640(G)     1.0637(G)     1.1523(G) speedups     0.9436(G)     0.8478(d)     0.9161(G)     0.8769(d) 
1231 [    ge    and    int64] : auto speedups     1.0231(G)     1.0479(G)     1.0383(G)     1.0891(G) speedups     0.8903(G)     0.8275(d)     0.8593(G)     0.8340(d) 
1232 [    ge    and   uint64] : auto speedups     0.9644(G)     1.0292(G)     0.9688(G)     1.0959(G) speedups     0.9045(G)     0.7859(d)     0.8805(G)     0.8414(d) 
1233 [    ge    and   single] : auto speedups     1.0524(G)     1.0772(G)     1.0884(G)     1.1401(G) speedups     0.9402(G)     1.0114(d)     0.9050(G)     1.0654(d) 
1234 [    ge    and   double] : auto speedups     1.0434(G)     1.0520(G)     1.0493(G)     1.1246(G) speedups     0.9482(G)     2.2717(d)     0.9250(G)     2.4198(d) 
1235 [    ge    xor  logical] : auto speedups     1.0655(G)     1.0632(G)     1.0710(G)     1.1626(G) speedups     0.9007(G)     1.0184(d)     0.8742(G)     1.0619(d) 
1236 [    ge    xor     int8] : auto speedups     1.0825(G)     1.0829(G)     1.0846(G)     1.1621(G) speedups     0.8912(G)     0.8762(d)     0.8773(G)     0.9031(d) 
1237 [    ge    xor    uint8] : auto speedups     1.0753(G)     1.0973(G)     1.0612(G)     1.1591(G) speedups     0.8929(G)     0.8567(d)     0.8843(G)     0.9003(d) 
1238 [    ge    xor    int16] : auto speedups     1.0509(G)     1.0926(G)     1.0757(G)     1.1465(G) speedups     0.9489(G)     0.8723(d)     0.9217(G)     0.9066(d) 
1239 [    ge    xor   uint16] : auto speedups     1.0608(G)     1.0507(G)     1.0818(G)     1.1495(G) speedups     0.8834(G)     0.8567(d)     0.8707(G)     0.8961(d) 
1240 [    ge    xor    int32] : auto speedups     1.0613(G)     1.0874(G)     1.0785(G)     1.1307(G) speedups     0.8894(G)     0.8773(d)     0.8739(G)     0.9091(d) 
1241 [    ge    xor   uint32] : auto speedups     1.0533(G)     1.0918(G)     1.0644(G)     1.1475(G) speedups     0.9018(G)     0.8767(d)     0.8825(G)     0.9073(d) 
1242 [    ge    xor    int64] : auto speedups     1.0115(G)     1.0408(G)     1.0292(G)     1.0892(G) speedups     0.8865(G)     0.8747(d)     0.8541(G)     0.9237(d) 
1243 [    ge    xor   uint64] : auto speedups     1.0094(G)     1.0312(G)     1.0408(G)     1.1024(G) speedups     0.9450(G)     0.8241(d)     0.9239(G)     0.8769(d) 
1244 [    ge    xor   single] : auto speedups     1.0408(G)     1.0456(G)     1.0595(G)     1.1291(G) speedups     0.8820(G)     1.0228(d)     0.8604(G)     1.0651(d) 
1245 [    ge    xor   double] : auto speedups     1.0159(G)     1.0329(G)     1.0163(G)     1.0629(G) speedups     0.8924(G)     2.3068(d)     0.8653(G)     2.3333(d) 
1246 [    ge     eq  logical] : auto speedups     1.0577(G)     1.1093(G)     1.1164(G)     1.1414(G) speedups     0.8964(G)     1.0249(d)     0.8737(G)     1.0507(d) 
1247 [    ge     eq     int8] : auto speedups     1.0958(G)     1.1119(G)     1.0839(G)     1.1618(G) speedups     1.0120(G)     0.8669(d)     0.9772(G)     0.8986(d) 
1248 [    ge     eq    uint8] : auto speedups     1.0936(G)     1.1096(G)     1.1027(G)     1.1449(G) speedups     0.8752(G)     0.8566(d)     0.8704(G)     0.8935(d) 
1249 [    ge     eq    int16] : auto speedups     1.0567(G)     1.1042(G)     1.0852(G)     1.1680(G) speedups     0.8736(G)     0.8516(d)     0.8647(G)     0.8974(d) 
1250 [    ge     eq   uint16] : auto speedups     1.0692(G)     1.1130(G)     1.0957(G)     1.1569(G) speedups     0.8797(G)     0.8617(d)     0.8582(G)     0.9081(d) 
1251 [    ge     eq    int32] : auto speedups     1.0675(G)     1.0917(G)     1.0775(G)     1.1356(G) speedups     0.8898(G)     0.8828(d)     0.8816(G)     0.9166(d) 
1252 [    ge     eq   uint32] : auto speedups     1.0707(G)     1.1056(G)     1.0796(G)     1.1647(G) speedups     0.9649(G)     0.8726(d)     0.9482(G)     0.9230(d) 
1253 [    ge     eq    int64] : auto speedups     1.0328(G)     1.0701(G)     1.0501(G)     1.1012(G) speedups     0.8907(G)     0.8915(d)     0.8704(G)     0.9087(d) 
1254 [    ge     eq   uint64] : auto speedups     1.0511(G)     1.0502(G)     1.0201(G)     1.1159(G) speedups     0.9342(G)     0.8208(d)     0.9195(G)     0.8619(d) 
1255 [    ge     eq   single] : auto speedups     1.0320(G)     1.0776(G)     1.0254(G)     1.1213(G) speedups     0.9610(G)     0.9890(d)     0.9126(G)     1.0592(d) 
1256 [    ge     eq   double] : auto speedups     1.0220(G)     1.0259(G)     1.0143(G)     1.1018(G) speedups     0.9505(G)     2.2720(d)     0.9124(G)     2.3136(d) 
1257 [    le    min  logical] : auto speedups     1.0379(G)     1.0665(G)     1.0594(G)     1.1371(G) speedups     0.8882(G)     0.9986(d)     0.8806(G)     1.0576(d) 
1258 [    le    min     int8] : auto speedups     1.0595(G)     1.0966(G)     1.0783(G)     1.1626(G) speedups     0.8358(G)     0.8231(d)     0.8203(G)     0.8670(d) 
1259 [    le    min    uint8] : auto speedups     1.0408(G)     1.0629(G)     1.0655(G)     1.1347(G) speedups     0.8496(G)     0.8232(d)     0.8241(G)     0.8713(d) 
1260 [    le    min    int16] : auto speedups     1.0599(G)     1.0905(G)     1.0832(G)     1.1523(G) speedups     0.8637(G)     0.8351(d)     0.8544(G)     0.8726(d) 
1261 [    le    min   uint16] : auto speedups     1.0521(G)     1.0573(G)     1.0451(G)     1.1065(G) speedups     0.9035(G)     0.8265(d)     0.8971(G)     0.8794(d) 
1262 [    le    min    int32] : auto speedups     1.0608(G)     1.0758(G)     1.0788(G)     1.1307(G) speedups     0.9399(G)     0.8012(d)     0.8862(G)     0.8544(d) 
1263 [    le    min   uint32] : auto speedups     0.9626(G)     0.9733(G)     1.0547(G)     1.1172(G) speedups     0.9484(G)     0.8332(d)     0.9278(G)     0.8930(d) 
1264 [    le    min    int64] : auto speedups     1.0333(G)     1.0351(G)     1.0414(G)     1.1115(G) speedups     0.8871(G)     0.8411(d)     0.8715(G)     0.8860(d) 
1265 [    le    min   uint64] : auto speedups     1.0116(G)     1.0318(G)     1.0082(G)     1.0500(G) speedups     0.8957(G)     0.7932(d)     0.8716(G)     0.8265(d) 
1266 [    le    min   single] : auto speedups     1.0679(G)     1.0971(G)     1.0842(G)     1.1538(G) speedups     0.8777(G)     1.0229(d)     0.8591(G)     1.0608(d) 
1267 [    le    min   double] : auto speedups     1.0474(G)     1.0619(G)     1.0530(G)     1.1146(G) speedups     0.8273(G)     2.3941(d)     0.8031(G)     2.5013(d) 
1268 [    le    max  logical] : auto speedups     1.1451(G)     1.1732(G)     1.1421(G)     1.2231(G) speedups     0.9228(G)     1.0107(d)     0.8856(G)     1.0437(d) 
1269 [    le    max     int8] : auto speedups     1.1219(G)     1.1826(G)     1.1464(G)     1.2392(G) speedups     0.8459(G)     0.8439(d)     0.8365(G)     0.9016(d) 
1270 [    le    max    uint8] : auto speedups     1.1091(G)     1.1424(G)     1.1072(G)     1.2176(G) speedups     0.8300(G)     0.8594(d)     0.8091(G)     0.8926(d) 
1271 [    le    max    int16] : auto speedups     1.1398(G)     1.1485(G)     1.1599(G)     1.2396(G) speedups     0.8279(G)     0.8655(d)     0.8108(G)     0.9065(d) 
1272 [    le    max   uint16] : auto speedups     1.1187(G)     1.1416(G)     1.1442(G)     1.1899(G) speedups     0.9496(G)     0.8742(d)     0.9172(G)     0.9124(d) 
1273 [    le    max    int32] : auto speedups     1.1668(G)     1.1838(G)     1.1428(G)     1.2470(G) speedups     0.9105(G)     0.8764(d)     0.8703(G)     0.9119(d) 
1274 [    le    max   uint32] : auto speedups     1.1335(G)     1.1249(G)     1.1257(G)     1.2025(G) speedups     0.9339(G)     0.8616(d)     0.9225(G)     0.9063(d) 
1275 [    le    max    int64] : auto speedups     1.1023(G)     1.0981(G)     1.1227(G)     1.1744(G) speedups     0.9097(G)     0.8745(d)     0.8838(G)     0.9070(d) 
1276 [    le    max   uint64] : auto speedups     1.0739(G)     1.0972(G)     1.0770(G)     1.1151(G) speedups     0.8941(G)     0.8197(d)     0.8709(G)     0.8473(d) 
1277 [    le    max   single] : auto speedups     1.0660(G)     1.0766(G)     1.0743(G)     1.1203(G) speedups     0.9523(G)     1.0208(d)     0.9338(G)     1.0640(d) 
1278 [    le    max   double] : auto speedups     1.0312(G)     1.0355(G)     1.0274(G)     1.1092(G) speedups     0.8254(G)     2.3492(d)     0.8122(G)     2.4565(d) 
1279 [    le   plus  logical] : auto speedups     1.1198(G)     1.1661(G)     1.1271(G)     1.2096(G) speedups     0.9176(G)     1.0158(d)     0.8978(G)     1.0718(d) 
1280 [    le   plus     int8] : auto speedups     1.1429(G)     1.1695(G)     1.1582(G)     1.2218(G) speedups     0.8541(G)     0.8435(d)     0.8389(G)     0.8973(d) 
1281 [    le   plus    uint8] : auto speedups     1.1002(G)     1.1438(G)     1.1310(G)     1.1862(G) speedups     0.8271(G)     0.8468(d)     0.8139(G)     0.8915(d) 
1282 [    le   plus    int16] : auto speedups     1.1420(G)     1.1660(G)     1.1580(G)     1.2256(G) speedups     0.8273(G)     0.8695(d)     0.8184(G)     0.9086(d) 
1283 [    le   plus   uint16] : auto speedups     1.0845(G)     1.1344(G)     1.1292(G)     1.1876(G) speedups     0.9361(G)     0.8689(d)     0.9060(G)     0.9137(d) 
1284 [    le   plus    int32] : auto speedups     1.1473(G)     1.1707(G)     1.1490(G)     1.2305(G) speedups     0.9059(G)     0.8597(d)     0.8806(G)     0.9082(d) 
1285 [    le   plus   uint32] : auto speedups     1.1279(G)     1.1351(G)     1.1350(G)     1.1903(G) speedups     0.9097(G)     0.8542(d)     0.9137(G)     0.9098(d) 
1286 [    le   plus    int64] : auto speedups     1.0956(G)     1.1183(G)     1.0831(G)     1.1767(G) speedups     0.8977(G)     0.8581(d)     0.8741(G)     0.9085(d) 
1287 [    le   plus   uint64] : auto speedups     1.0725(G)     1.0865(G)     1.0807(G)     1.1548(G) speedups     0.8854(G)     0.8185(d)     0.8688(G)     0.8644(d) 
1288 [    le   plus   single] : auto speedups     1.0636(G)     1.0877(G)     1.0833(G)     1.1394(G) speedups     0.9473(G)     1.0088(d)     0.9269(G)     1.0624(d) 
1289 [    le   plus   double] : auto speedups     1.0436(G)     1.0439(G)     1.0251(G)     1.1093(G) speedups     0.8171(G)     2.3138(d)     0.8135(G)     2.4144(d) 
1290 [    le  times  logical] : auto speedups     1.0499(G)     1.0772(G)     1.0558(G)     1.1400(G) speedups     0.8944(G)     1.0067(d)     0.8792(G)     1.0687(d) 
1291 [    le  times     int8] : auto speedups     1.0769(G)     1.0820(G)     1.0813(G)     1.1610(G) speedups     0.8283(G)     0.8239(d)     0.8223(G)     0.8689(d) 
1292 [    le  times    uint8] : auto speedups     1.0604(G)     1.0706(G)     1.0573(G)     1.1380(G) speedups     0.8457(G)     0.8201(d)     0.8281(G)     0.8685(d) 
1293 [    le  times    int16] : auto speedups     1.0637(G)     0.9340(G)     0.9967(G)     1.0209(G) speedups     0.8138(G)     0.7453(d)     0.7385(G)     0.6588(d) 
1294 [    le  times   uint16] : auto speedups     0.9168(G)     0.9637(G)     0.9184(G)     0.9662(G) speedups     0.8011(G)     0.5397(d)     0.8093(G)     0.6552(d) 
1295 [    le  times    int32] : auto speedups     0.9785(G)     0.9556(G)     0.9496(G)     1.0183(G) speedups     0.8383(G)     0.7187(d)     0.8411(G)     0.7968(d) 
1296 [    le  times   uint32] : auto speedups     0.9706(G)     0.9697(G)     0.9594(G)     1.0156(G) speedups     0.8386(G)     0.5934(d)     0.8566(G)     0.6592(d) 
1297 [    le  times    int64] : auto speedups     0.9307(G)     0.9715(G)     0.6439(G)     0.7571(G) speedups     0.8224(G)     0.6948(d)     0.7986(G)     0.7836(d) 
1298 [    le  times   uint64] : auto speedups     0.9082(G)     0.9006(G)     0.8710(G)     0.9941(G) speedups     0.8002(G)     0.7238(d)     0.7977(G)     0.7132(d) 
1299 [    le  times   single] : auto speedups     0.7381(G)     0.8272(G)     1.0363(G)     1.0563(G) speedups     0.8098(G)     0.9315(d)     0.7825(G)     0.9019(d) 
1300 [    le  times   double] : auto speedups     0.7836(G)     0.6334(G)     0.6238(G)     0.8964(G) speedups     0.6610(G)     1.9154(d)     0.6405(G)     2.3271(d) 
1301 [    le     or  logical] : auto speedups     0.9810(G)     1.0720(G)     1.0399(G)     1.0805(G) speedups     0.8376(G)     0.9067(d)     0.8351(G)     0.9689(d) 
1302 [    le     or     int8] : auto speedups     0.7658(G)     1.0544(G)     0.6322(G)     0.8091(G) speedups     0.4438(G)     0.6708(d)     0.7777(G)     0.8226(d) 
1303 [    le     or    uint8] : auto speedups     1.0553(G)     1.0515(G)     1.0459(G)     1.0389(G) speedups     0.7433(G)     0.7665(d)     0.7592(G)     0.6226(d) 
1304 [    le     or    int16] : auto speedups     1.0509(G)     1.0922(G)     1.0965(G)     1.1637(G) speedups     0.8007(G)     0.8185(d)     0.7769(G)     0.8520(d) 
1305 [    le     or   uint16] : auto speedups     1.0519(G)     1.0369(G)     1.0586(G)     1.1378(G) speedups     0.8717(G)     0.8190(d)     0.8739(G)     0.8573(d) 
1306 [    le     or    int32] : auto speedups     1.1087(G)     1.1165(G)     1.1125(G)     1.1635(G) speedups     0.8526(G)     0.7905(d)     0.8234(G)     0.8355(d) 
1307 [    le     or   uint32] : auto speedups     1.0169(G)     1.0660(G)     0.9924(G)     1.1082(G) speedups     0.8838(G)     0.7667(d)     0.8473(G)     0.8254(d) 
1308 [    le     or    int64] : auto speedups     1.0134(G)     1.0441(G)     1.0027(G)     1.0424(G) speedups     0.8487(G)     0.8082(d)     0.8131(G)     0.8555(d) 
1309 [    le     or   uint64] : auto speedups     1.0042(G)     1.0218(G)     1.0023(G)     1.0664(G) speedups     0.8330(G)     0.7618(d)     0.8241(G)     0.8154(d) 
1310 [    le     or   single] : auto speedups     1.0142(G)     1.0501(G)     1.0214(G)     1.1150(G) speedups     0.9020(G)     0.9494(d)     0.8810(G)     1.0172(d) 
1311 [    le     or   double] : auto speedups     0.9721(G)     0.9814(G)     0.9693(G)     1.0545(G) speedups     0.7840(G)     2.1315(d)     0.7582(G)     2.2540(d) 
1312 [    le    and  logical] : auto speedups     0.9996(G)     1.0260(G)     1.0219(G)     1.0826(G) speedups     0.8620(G)     0.9547(d)     0.8375(G)     1.0040(d) 
1313 [    le    and     int8] : auto speedups     1.0246(G)     1.0305(G)     1.0369(G)     1.0861(G) speedups     0.7983(G)     0.7779(d)     0.7690(G)     0.8309(d) 
1314 [    le    and    uint8] : auto speedups     0.9999(G)     1.0097(G)     1.0048(G)     1.0742(G) speedups     0.8102(G)     0.7777(d)     0.7960(G)     0.8089(d) 
1315 [    le    and    int16] : auto speedups     1.0052(G)     1.0260(G)     1.0239(G)     1.0882(G) speedups     0.8259(G)     0.7930(d)     0.8081(G)     0.8294(d) 
1316 [    le    and   uint16] : auto speedups     0.9886(G)     1.0139(G)     0.9977(G)     1.0696(G) speedups     0.8744(G)     0.7849(d)     0.8502(G)     0.8266(d) 
1317 [    le    and    int32] : auto speedups     1.0145(G)     1.0382(G)     1.0144(G)     1.0995(G) speedups     0.8920(G)     0.7924(d)     0.8643(G)     0.8431(d) 
1318 [    le    and   uint32] : auto speedups     1.0071(G)     0.9917(G)     0.9874(G)     1.0702(G) speedups     0.8965(G)     0.7898(d)     0.8805(G)     0.8375(d) 
1319 [    le    and    int64] : auto speedups     0.9933(G)     0.9823(G)     0.9919(G)     1.0315(G) speedups     0.8474(G)     0.8000(d)     0.8231(G)     0.8385(d) 
1320 [    le    and   uint64] : auto speedups     0.9696(G)     0.9672(G)     0.9399(G)     1.0472(G) speedups     0.8566(G)     0.7483(d)     0.8262(G)     0.7908(d) 
1321 [    le    and   single] : auto speedups     1.0170(G)     1.0277(G)     1.0259(G)     1.0918(G) speedups     0.8318(G)     0.9633(d)     0.8166(G)     1.0188(d) 
1322 [    le    and   double] : auto speedups     1.0115(G)     1.0064(G)     0.9835(G)     1.0468(G) speedups     0.7788(G)     2.2876(d)     0.7592(G)     2.2581(d) 
1323 [    le    xor  logical] : auto speedups     0.9971(G)     1.0105(G)     1.0126(G)     1.0538(G) speedups     0.8594(G)     0.9506(d)     0.8501(G)     0.9979(d) 
1324 [    le    xor     int8] : auto speedups     1.0235(G)     1.0434(G)     1.0183(G)     1.0956(G) speedups     0.8644(G)     0.7956(d)     0.8463(G)     0.8696(d) 
1325 [    le    xor    uint8] : auto speedups     0.9895(G)     0.9807(G)     1.0124(G)     1.0623(G) speedups     0.8461(G)     0.8196(d)     0.8277(G)     0.8446(d) 
1326 [    le    xor    int16] : auto speedups     1.0051(G)     1.0291(G)     0.9906(G)     1.0917(G) speedups     0.9018(G)     0.8301(d)     0.8836(G)     0.8541(d) 
1327 [    le    xor   uint16] : auto speedups     1.0006(G)     1.0076(G)     1.0035(G)     1.0624(G) speedups     0.8289(G)     0.8193(d)     0.8210(G)     0.8695(d) 
1328 [    le    xor    int32] : auto speedups     1.0154(G)     1.0327(G)     1.0152(G)     1.0830(G) speedups     0.8689(G)     0.8263(d)     0.8424(G)     0.8776(d) 
1329 [    le    xor   uint32] : auto speedups     0.9965(G)     0.9907(G)     0.9756(G)     1.0667(G) speedups     0.8483(G)     0.8162(d)     0.8367(G)     0.8719(d) 
1330 [    le    xor    int64] : auto speedups     0.9795(G)     0.9792(G)     0.9671(G)     1.0591(G) speedups     0.8379(G)     0.8279(d)     0.8154(G)     0.8643(d) 
1331 [    le    xor   uint64] : auto speedups     0.9517(G)     0.9566(G)     0.9604(G)     1.0260(G) speedups     0.8566(G)     0.6489(d)     0.7719(G)     0.7083(d) 
1332 [    le    xor   single] : auto speedups     0.9482(G)     1.0072(G)     0.9877(G)     1.0673(G) speedups     0.8727(G)     0.9649(d)     0.8642(G)     1.0127(d) 
1333 [    le    xor   double] : auto speedups     0.9783(G)     0.9851(G)     0.9783(G)     1.0460(G) speedups     0.8938(G)     2.0241(d)     0.8813(G)     2.0637(d) 
1334 [    le     eq  logical] : auto speedups     1.0226(G)     1.0222(G)     0.9637(G)     1.0412(G) speedups     0.7783(G)     0.8684(d)     0.7661(G)     0.9273(d) 
1335 [    le     eq     int8] : auto speedups     0.9709(G)     1.0228(G)     0.9995(G)     1.1215(G) speedups     0.8515(G)     0.8112(d)     0.8357(G)     0.8524(d) 
1336 [    le     eq    uint8] : auto speedups     1.0218(G)     1.0190(G)     1.0318(G)     1.0794(G) speedups     0.8716(G)     0.8138(d)     0.8542(G)     0.8512(d) 
1337 [    le     eq    int16] : auto speedups     1.0371(G)     1.0421(G)     1.0352(G)     1.0847(G) speedups     0.9121(G)     0.8163(d)     0.8733(G)     0.8525(d) 
1338 [    le     eq   uint16] : auto speedups     1.0110(G)     0.9826(G)     1.0001(G)     1.0903(G) speedups     0.9165(G)     0.8047(d)     0.9147(G)     0.8372(d) 
1339 [    le     eq    int32] : auto speedups     1.0118(G)     1.0220(G)     1.0376(G)     1.0920(G) speedups     0.9339(G)     0.8197(d)     0.8860(G)     0.8715(d) 
1340 [    le     eq   uint32] : auto speedups     1.0075(G)     1.0051(G)     1.0210(G)     1.0681(G) speedups     0.8439(G)     0.8122(d)     0.8441(G)     0.8363(d) 
1341 [    le     eq    int64] : auto speedups     0.9887(G)     1.0031(G)     1.0024(G)     1.0305(G) speedups     0.9052(G)     0.8124(d)     0.8663(G)     0.8666(d) 
1342 [    le     eq   uint64] : auto speedups     0.9887(G)     0.9569(G)     0.9613(G)     1.0376(G) speedups     0.8251(G)     0.7554(d)     0.8181(G)     0.7972(d) 
1343 [    le     eq   single] : auto speedups     1.0004(G)     1.0040(G)     1.0138(G)     1.0917(G) speedups     0.8075(G)     0.9471(d)     0.7893(G)     0.9840(d) 
1344 [    le     eq   double] : auto speedups     0.9515(G)     0.9801(G)     0.9701(G)     1.0094(G) speedups     0.9079(G)     2.0940(d)     0.8874(G)     2.1286(d) 

test06: all tests passed

testperf:  all tests passed.  Total time 43874.9

