Graphics cards also have a memory bandwidth issue at the memory bus controller, and it is bottlenecking the computation. The memory bus controller doesn't process enough bits: it should process at least 512 bits at a time. The current top-of-the-line RTX 3090 processes 384 bits.
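To put rough numbers on the bus width point, here is a quick sketch of the usual peak bandwidth arithmetic (bus width in bytes times per-pin data rate). The 19.5 Gbit/s per-pin rate is the RTX 3090's GDDR6X spec; the 512-bit card is purely hypothetical and assumes the same memory speed, and the function name is just made up for this example.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbit_s: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory bus (e.g. 384)
    data_rate_gbit_s: effective data rate per pin in Gbit/s
    """
    # Each transfer moves bus_width_bits / 8 bytes across the bus.
    return bus_width_bits / 8 * data_rate_gbit_s


# RTX 3090: 384-bit bus, 19.5 Gbit/s per pin (GDDR6X)
rtx_3090 = peak_bandwidth_gb_s(384, 19.5)

# Hypothetical 512-bit bus at the same per-pin rate (an assumption, not a real card)
wide_bus = peak_bandwidth_gb_s(512, 19.5)

print(f"384-bit: {rtx_3090:.0f} GB/s")
print(f"512-bit: {wide_bus:.0f} GB/s")
print(f"gain:    {wide_bus / rtx_3090 - 1:.0%}")
```

So at the same memory speed, going from 384 to 512 bits would give about a third more theoretical bandwidth. This is only the peak figure; real throughput depends on access patterns and caches.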
You can see this in a comparison of cards with the same graphics architecture (pic related), where the difference is just FEWER computation cores and a HIGHER memory bus speed. It is also clocked lower, which means less data is transported and computed. In boost the difference is just 15 MHz, and under load the card is mostly in boost mode, but you can see it in the benchmarks: the A5000 is faster in nearly all computations while having less computation power, just a bigger memory bus.
https://www.youtube.com/watch?v=xtN9dX0oWzY