A test question: "debugging a system issue"

I recently took an online test. One section asked for possible causes of, and solutions to, a problem in production. The prompt went like this:

They asked about possible causes and the first steps to take.

My answers to the first question
- DDoS attack
- Debug logging that writes to disk was accidentally enabled
- Someone manually running a slow command (SORT in Redis, for example)
- Out of memory
- Congested network (maybe some sort of backup is running?)
- Hardware issue (for example, SAN firmware issues that slow down disk access, backup VMs)

My answers to the second question
- Run a sample query against PostgreSQL first, then against Redis, to quickly check whether either or both are slow.
- Latency check on Redis: ./redis-cli --intrinsic-latency <seconds> (run on the Redis host itself; it measures the host's baseline latency, not a remote server's).
- Check CPU and memory usage (top on a physical machine or VM, or another tool such as Grafana for containers).
- If a lot of production requests are coming in, block them and check whether response times improve.
- Maybe increase the socket buffer queue size in the kernel.
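
The first step in the list (query PostgreSQL, then Redis, and see which one is slow) could be sketched roughly as below. This is only an illustration: the two probe functions are hypothetical stand-ins that sleep instead of talking to a real server. In an actual check they would wrap a trivial query such as `SELECT 1` through a PostgreSQL driver and a `PING` through a Redis client.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(probe: Callable[[], object], runs: int = 5) -> List[float]:
    """Run `probe` several times and return each duration in milliseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        probe()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def compare(probes: Dict[str, Callable[[], object]], runs: int = 5) -> Dict[str, float]:
    """Return the median latency in milliseconds for each named probe."""
    return {
        name: statistics.median(time_calls(probe, runs))
        for name, probe in probes.items()
    }


# Hypothetical stand-ins (assumptions, not real database calls).
def fake_postgres_probe() -> None:
    time.sleep(0.02)


def fake_redis_probe() -> None:
    time.sleep(0.001)


if __name__ == "__main__":
    medians = compare({"postgres": fake_postgres_probe, "redis": fake_redis_probe})
    # Print the slowest dependency first.
    for name, ms in sorted(medians.items(), key=lambda item: item[1], reverse=True):
        print(f"{name}: {ms:.1f} ms")
```

Using the median over a few runs rather than a single measurement keeps one unlucky request from pointing at the wrong component.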
Review received
Architecture debugging – brainstorming
- We liked that you came up with several ideas on what could be causing slow recommendations.
- We also liked that you provided distinct ideas around redis cache, hardware, and networking.
- Some top solutions grouped ideas by system or issue type. We find that this helps people consider a wider variety of potential issues.
- We liked solutions that indicated whether they thought one of the options was more or less likely than the others.
Architecture debugging – email a teammate
- We liked that you organized your recommendations into sections to improve readability and make discussion easier.
- Our favorite responses included a high-level summary of the issue to provide context.
- Some of the best responses provided a bit more detail for a junior teammate. What are some specific steps or commands they could take to track down a network issue? How could they test the API calls that are being made? (A few people even included a sample curl command or a link to documentation.)
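
As a rough idea of the kind of detail that last point asks for, a junior teammate could time an API call with something like the sketch below, using only the Python standard library. The URL is a hypothetical placeholder, since the test did not give a real endpoint.

```python
import time
import urllib.error
import urllib.request
from typing import Tuple


def time_request(url: str, timeout: float = 5.0) -> Tuple[int, float]:
    """Issue a GET request and return (HTTP status, elapsed milliseconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # include the body transfer in the measurement
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # a 4xx/5xx reply still shows the API answered
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return status, elapsed_ms


if __name__ == "__main__":
    # Hypothetical endpoint (an assumption); replace with the real API URL.
    url = "http://localhost:8080/recommendations?user_id=1"
    try:
        status, elapsed_ms = time_request(url)
        print(f"{url} -> HTTP {status} in {elapsed_ms:.1f} ms")
    except (urllib.error.URLError, OSError) as err:
        # Connection refused, DNS failure, or timeout: a network-level hint.
        print(f"{url} -> request failed: {err}")
```

Running it a few times from different hosts (the application server, a machine on another network segment) can help separate a slow API from a slow network path.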