At my previous company, we ran into a problem where our MongoDB server became very slow, which affected all of our queries. Fortunately, we were on Atlas, so the Profiler was available to help us analyse what was going on.
Here are some things we looked at.
High Operation Time
Recurring DB queries
Performance Advisor for recommended indexes (though its suggestions are not useful every time)
Add Cache Layer
After looking at the metrics, we reduced recurring DB queries by adding a caching layer in front of the database, since the underlying data changed far less often than we were running the expensive queries against it.
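To make the idea concrete, here is a minimal sketch of a time-based cache sitting in front of an expensive query. It is not the code we ran in production: the `TTLCache` class, the `get_or_load` method, and the 60-second lifetime used below are all assumptions for illustration, and a shared cache such as Redis would follow the same pattern.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling loader() only when needed."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the database entirely
        value = loader()  # missing or expired: run the expensive query once
        self._entries[key] = (now + self._ttl, value)
        return value
```

A caller would wrap the expensive query in a function and pass it as the loader, for example `cache.get_or_load("daily-report", run_report_query)`, where `run_report_query` is a hypothetical function that hits MongoDB. The TTL should be chosen based on how stale the data is allowed to be.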
First, we added indexes on a couple of fields that were heavily used in our aggregation pipelines.
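As a rough sketch of that step, assuming a PyMongo-style collection object that exposes `create_index`: the field names (`status`, `createdAt`, `customerId`) and the helper `ensure_indexes` are hypothetical, not the ones we actually used.

```python
from typing import Any, List, Tuple

ASCENDING = 1  # same value as pymongo.ASCENDING

# Hypothetical index definitions: one compound index for the fields we
# filter on together, and one single-field index for the join key.
INDEX_SPECS: List[List[Tuple[str, int]]] = [
    [("status", ASCENDING), ("createdAt", ASCENDING)],
    [("customerId", ASCENDING)],
]


def ensure_indexes(collection: Any) -> List[str]:
    """Create each index on the collection and return the index names.

    `collection` is assumed to behave like a pymongo Collection, whose
    create_index(keys) returns the name of the index.
    """
    return [collection.create_index(keys) for keys in INDEX_SPECS]
```

The useful indexes depend entirely on which fields your own filters and lookups touch, so treat the Profiler output as the source of truth rather than this list.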
Second, we improved the performance of the pipeline by moving our $match filter stage before the $lookup, which reduced the number of lookups that had to be performed.
Finally, we used $project to limit the amount of data we passed from one stage to the next.
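The stage ordering described above might look roughly like the following. This is an illustrative pipeline, not our real one: the collection name `customers`, the field names, and the `build_order_pipeline` helper are all assumptions.

```python
from typing import Any, Dict, List


def build_order_pipeline(status: str) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that filters before it joins."""
    return [
        # 1. Filter first, so only matching documents reach the join.
        #    This only works for conditions on the source collection's own fields.
        {"$match": {"status": status}},
        # 2. Join the (now much smaller) set of documents to customers.
        {"$lookup": {
            "from": "customers",
            "localField": "customerId",
            "foreignField": "_id",
            "as": "customer",
        }},
        # 3. Keep only the fields that later stages or the client need.
        {"$project": {"_id": 0, "total": 1, "customer.name": 1}},
    ]
```

With PyMongo, the resulting list would be passed to `collection.aggregate(...)`. A $match that depends on fields produced by the $lookup cannot be moved ahead of it, so only the filters on the original documents benefit from this reordering.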
As a last step, we also had to increase the CPU and RAM of our infrastructure to handle the increased volume of queries being made to the DB.