24 Jan 08
Five whys - Joel on Software
After some internal discussion we all agreed that rather than imposing a
statistically meaningless measurement and hoping that the mere measurement of
something meaningless would cause it to get better, what we really needed was a
process of continuous improvement. Instead of setting up an SLA for our
customers, we set up a blog where we would document every outage in real time,
provide complete
post-mortems, ask the five whys, get to the root cause, and tell our customers
what we're doing to prevent that problem in the future. In this case, the change
is that our internal documentation will include detailed checklists
for all operational procedures in the live environment.
- Our link to Peer1 NY went down
- Why? – Our switch appears to have put the port in a failed state
- Why? – After some discussion with the Peer1 NOC, we speculate that it was
  quite possibly caused by an Ethernet speed / duplex mismatch
- Why? – The switch interface was set to auto-negotiate instead of being
  manually configured
- Why? – We were fully aware of problems like this, and have been for many
  years. But - we do not have a written standard and verification process
  for production switch configurations.
- Why? – Documentation is often thought of as an aid for when the sysadmin
  isn’t around or for other members of the operations team, whereas, it should
  really be thought of as a checklist.
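
The "written standard and verification process" mentioned in the fourth why could look something like the sketch below. It is only an illustration, not anything from the original post: the `PortConfig` type, the `check_port` function, the port name "uplink", and the speed and duplex values are all hypothetical, and a real check would need to read the settings from the actual switch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortConfig:
    """Hypothetical record of one switch port's link settings."""
    name: str
    speed: str   # e.g. "1000", or "auto" when auto-negotiated
    duplex: str  # e.g. "full", "half", or "auto"


def check_port(actual: PortConfig, standard: PortConfig) -> List[str]:
    """Compare a port against the written standard and list every deviation."""
    problems = []
    for field in ("speed", "duplex"):
        got = getattr(actual, field)
        want = getattr(standard, field)
        if got == "auto":
            # The outage was traced to a port left on auto-negotiate.
            problems.append(
                f"{actual.name}: {field} is auto-negotiated, standard requires {want}"
            )
        elif got != want:
            problems.append(
                f"{actual.name}: {field} is {got}, standard requires {want}"
            )
    return problems


# Assumed example values: the standard pins speed and duplex manually.
standard = PortConfig("uplink", speed="1000", duplex="full")
actual = PortConfig("uplink", speed="auto", duplex="auto")

for problem in check_port(actual, standard):
    print(problem)
```

Treating the standard as data and the check as a function makes the documentation behave like a checklist: every production port either passes or produces a specific item to fix.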
