2.7.1 Troubleshooting
Learn to use a systematic troubleshooting process.
Troubleshooting Methodology
CompTIA A+ Defined Standards:
- Identify the problem
- Establish a theory of probable cause
- Test the theory to determine the cause
- Establish a plan of action and implement the solution
- Verify full system functionality and implement preventive measures
- Document findings
Gather Information
Determine exactly what happened. Ask the user who reported the issue about the symptoms and any error messages. Try to recreate the problem, since seeing it firsthand often makes problem-solving faster. Find out whether the problem is localized or whether other users on the network are experiencing the same issue. Review tickets and logs, or test other systems for the issue.
- Gather information from log files and error messages
- Question users
- Identify symptoms
- Determine recent changes
- Duplicate the problem
- Approach multiple problems individually
- Narrow the scope of the problem
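Gathering information from log files can be partly automated. The following is a minimal Python sketch, not a definitive tool: it assumes plain-text logs that contain severity keywords such as ERROR or WARN, and the function name `summarize_log` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the logs mark severity with keywords like these.
# Adjust the pattern to the format your systems actually emit.
SEVERITY = re.compile(r"\b(ERROR|CRITICAL|FATAL|WARN(?:ING)?)\b", re.IGNORECASE)


def summarize_log(lines):
    """Return (count per severity, list of matching lines) for log lines."""
    counts = Counter()
    matches = []
    for line in lines:
        found = SEVERITY.search(line)
        if found:
            level = found.group(1).upper()
            # Treat WARN and WARNING as the same level.
            if level == "WARN":
                level = "WARNING"
            counts[level] += 1
            matches.append(line.rstrip("\n"))
    return counts, matches


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the call.
    sample = [
        "09:00:01 INFO  service started",
        "09:02:17 WARN  disk usage high",
        "09:05:44 ERROR failed to mount volume",
        "09:05:45 ERROR failed to mount volume",
    ]
    counts, matches = summarize_log(sample)
    for level, n in counts.most_common():
        print(level, n)
```

Repeated identical errors in the output are a hint about where to narrow the scope first.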
Identify Changes
Find out whether any changes were made to the system before the problem arose. Did the user:
- add a new device?
- install a new program?
- visit a new (possibly unauthorized) website?
Another possibility is that changes to the environment or network infrastructure are limiting resources, so check for that too.
- Question the obvious
- Consider multiple approaches, such as working top to bottom or bottom to top through layered technologies (like networks)
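The top to bottom or bottom to top approach can be sketched as a loop over layers. This Python sketch assumes the seven OSI layers as the layering, which the notes above do not specify. The function name and the per-layer check callables are hypothetical placeholders for whatever tests you actually run (link light, ping, service request, and so on).

```python
# Assumption: the OSI model, listed from the lowest layer to the highest.
LAYERS = [
    "physical",
    "data link",
    "network",
    "transport",
    "session",
    "presentation",
    "application",
]


def first_failing_layer(checks, bottom_up=True):
    """Run per-layer checks in order and return the first layer that fails.

    checks maps a layer name to a callable returning True when that
    layer works. Layers without a check are skipped. Returns None when
    every supplied check passes.
    """
    order = LAYERS if bottom_up else list(reversed(LAYERS))
    for layer in order:
        check = checks.get(layer)
        if check is None:
            continue  # no test defined for this layer
        if not check():
            return layer
    return None
```

Bottom up finds the lowest broken layer first, which is useful when a cable or link fault would explain everything above it. Top down starts from what the user actually sees.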
Establish a Theory
Before attempting any changes, make sure that all the data on the system is backed up and saved properly. Then establish a theory of probable cause, starting with the simple and obvious problems and their solutions. If testing shows that your theory doesn't hold up, return to the previous step and establish a new theory. If you don't have the required permissions or have already exhausted your own resources, you might have to escalate the issue. If the problem turns out to be complex, create an action plan before making changes.
Create an Action Plan
Consider the fix along with its side effects.
- Will the fix result in significant system downtime?
- Is the resolution best left for slower times of the day?
- Is there a temporary solution that you can implement immediately?
Get any needed approval from the people who will be impacted the most. Check vendor documentation for diagnostic tools and time-saving tips. After implementing your solution, verify that the problem is fully resolved and that the fix has not introduced any new issues.
Wrap-up
Document your findings. Provide a record of what the problem was and what you did to resolve it. Remember to explain to the customer what you did, and ensure their satisfaction. This will help them understand and accept that the problem has really been solved, and may even give them the information they need to resolve the issue themselves should it occur again.
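A consistent record format makes documented findings easier to search later. This is one possible shape for such a record, sketched in Python. The class name and every field name are assumptions for illustration, not a CompTIA-defined format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TroubleshootingRecord:
    """One documented incident. All field names are illustrative."""

    reported_on: date
    problem: str
    cause: str
    resolution: str
    symptoms: list = field(default_factory=list)
    preventive_measures: list = field(default_factory=list)

    def to_text(self):
        """Render the record as plain text for a ticket or knowledge base."""
        lines = [
            f"Date: {self.reported_on.isoformat()}",
            f"Problem: {self.problem}",
            "Symptoms:",
        ]
        lines += [f"- {s}" for s in self.symptoms]
        lines += [f"Cause: {self.cause}", f"Resolution: {self.resolution}"]
        # Only list preventive measures when some were recorded.
        if self.preventive_measures:
            lines.append("Preventive measures:")
            lines += [f"- {m}" for m in self.preventive_measures]
        return "\n".join(lines)
```

The same text can be shared with the customer as the explanation of what was done.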
Tips
- Often, the hardest part of troubleshooting is to reproduce the problem. You might need to ask the user questions to identify exactly how the problem occurred, or you might need to watch them perform the task again to reproduce the problem.
- If a hardware device or a software program causes a specific error, check the manufacturer's website for additional help in troubleshooting the error.
- To help diagnose issues, you can run special software tools supplied by the hardware manufacturer.
- In addition to a basic toolkit, keep a few spare parts on hand that you know to be in working order. If you suspect that a component has failed, swap in the known good spare. If that solves the problem, you have confirmed the fault and can replace the faulty component permanently.
- Intermittent problems are difficult to troubleshoot. Check for environmental conditions such as kinked cables or overheated components.
- If you have problems identifying a hardware error, you can simplify the system by removing all but necessary components (processor, memory, and hard disk). Add devices one at a time and restart the system. If an error occurs, remove the newly added device and troubleshoot that device. Another strategy would be to remove a single device and restart the system, seeing if removing that device corrects the problem.
- Some problems might be caused by software errors, not hardware failures. You might need to begin by updating the drivers or unloading software.
- Before you make changes, always consider corporate policies and procedures and the changes' impact on other people and components.
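The strip-down strategy from the tips above (start with only the necessary components, then add devices one at a time) is a simple linear search. The Python sketch below models it. The function name is made up, and `boots_cleanly` is a hypothetical stand-in for physically installing the devices, restarting the system, and observing whether the error appears.

```python
def find_faulty_device(devices, boots_cleanly):
    """Add devices one at a time and return the first one that causes an error.

    devices: optional devices, in the order they will be reinstalled.
    boots_cleanly: callable that takes the list of currently installed
        optional devices and returns True if the system restarts without
        the error. In practice a person performs this step by hand.
    Returns None if every device can be added without an error.
    """
    installed = []
    # With only the necessary components installed, the system should work.
    if not boots_cleanly(installed):
        raise RuntimeError("error occurs with only the core components installed")
    for device in devices:
        installed.append(device)
        if not boots_cleanly(installed):
            installed.pop()  # remove the newly added device
            return device    # troubleshoot this device next
    return None
```

For example, with a stand-in check that fails whenever a particular card is present:

```python
def fake_boot(installed):
    return "sound card" not in installed

print(find_faulty_device(["network card", "sound card", "video card"], fake_boot))
```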