Applying the correct troubleshooting steps to your problems is crucial:
a) Understand what resources have been saturated.
b) Understand what if anything is causing an error.
c) From there you can divert into the areas that are related to that resource and start to narrow down the issue.
d) Tackle the problems bit by bit.
Having the right tools for the job key for success. PMM is a great example of a tool that can help you quickly identify, drill in, and fix bottlenecks.
Have realistic load tests. In this case, they had tested the new release on a concurrency level that was not like their production
By identifying the culprit query we were able to:
a.) Drop average query time from 20s to 661ms
b.) Increase QPS by 2x
c.) Reduce the usage of CPU to 1/4 of its level prior to our intervention
First off – not trying to kindle any flame wars here, just trying to broaden my (your) horizons a bit, gather some ideas (maybe I’m missing out on something cool, it’s the most used Open Source RDBMS after all) and to somewhat compare the two despite being a difficult thing to do correctly / objectively. Also I’m leaving aside here performance comparisons and looking at just the available features, general querying experience and documentation clarity as this is I guess most important for beginners. So just a list of points I made for myself, grouped in no particular order.