I recently saw a great case study presentation by Cook Children’s Health Care System on how their network team uses APM technology to improve application performance and provide better service to their patients and physicians. I’ve summarized the key takeaways below.
Cook Children’s is a nationally recognized pediatric care provider that operates a Medical Center, Pediatric Surgery Center, Home Health Company, Health Plan, and a Physicians Network of over 60 locations.
Challenges
Network manager, Ross Jones, shared the following challenges at OPNETWORK 2012:
- End users’ perception of “slowness”.
- Lack of application performance trending, so problems were not identified until customers called in with complaints.
- Root cause analysis could often go on for days.
- Difficulty monitoring the network because of the need for visibility into multiple 10G links, high volumes of duplicate packet data, and the need to share SPAN traffic with multiple tools.
The Anue Net Tool Optimizer (NTO) was implemented between network access points to enhance network traffic visibility for the APM platform. OPNET AppResponse Xpert was selected to meet the APM requirements.
- APM - The OPNET APM Xpert suite enabled 24x7 application transaction monitoring, packet storage, and network analysis, while providing integrated software add-ons for dependency mapping, SNMP reporting, database monitoring, and pre-deployment application testing.
- Aggregation & Filtering - Ixia’s Anue NTO was implemented to aggregate traffic from the multiple 10G SPANs and direct it to the OPNET appliance after filtering out duplicate packets.
The joint OPNET and Anue solution had an immediate impact, improving networked application performance and reducing troubleshooting time, as illustrated by these two examples.
- Proactively monitor EMR application - Before the APM deployment, the networking team would often not be aware of slow response time on the remotely hosted Electronic Medical Records (EMR) application until a physician would call in to complain. Once the problem was identified, it took time to sort out if the root cause of the issue was the application or the network. This issue was resolved by baselining and continuously monitoring end-user response time and WAN network latency. Thresholds were set using OPNET’s alert functionality so that the networking team can be aware of and correct potential issues before users complain. When problems do occur, MTTR is reduced by focused root cause analysis.
- Identifying source of Outlook/Exchange Problem - During a conversion from Outlook 2007 to 2010, users at different venues and on different desktop operating systems could not make Outlook client connections to Exchange. Using OPNET’s trace data to contrast good connections with failed ones, the networking team was able to see that the client was connecting to the load balancer, but going no further. This information from the joint OPNET/Anue solution was vital to discovering an exhausted connection table in the load balancer, which would not allow any new backend connections. Without the APM solution, a difficult-to-find problem like this would have taken much longer to diagnose and resolve.
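The baselining and threshold alerting described in the EMR example can be illustrated with a small sketch. This is not OPNET’s alert functionality, whose configuration the presentation did not detail. It is a generic illustration that assumes response times are available as a list of millisecond samples, and it uses a hypothetical rule of "mean plus three standard deviations" as the threshold. All function names and sample values are made up for the example.

```python
from statistics import mean, stdev

def build_baseline(samples_ms):
    """Summarize historical response times (ms) as (mean, sample std deviation)."""
    return mean(samples_ms), stdev(samples_ms)

def threshold_ms(baseline, k=3.0):
    """Hypothetical alert rule: baseline mean plus k standard deviations."""
    avg, sd = baseline
    return avg + k * sd

def find_breaches(samples_ms, limit_ms):
    """Return (index, value) for every sample above the alert threshold."""
    return [(i, s) for i, s in enumerate(samples_ms) if s > limit_ms]

# Illustrative numbers only, not measurements from the case study.
history = [200, 210, 190, 205, 195]      # past EMR response times in ms
baseline = build_baseline(history)
limit = threshold_ms(baseline)

latest = [198, 207, 450, 201]            # newly observed response times in ms
breaches = find_breaches(latest, limit)
for index, value in breaches:
    print(f"sample {index}: {value} ms exceeds threshold {limit:.1f} ms")
```

The point of the sketch is the workflow rather than the statistics: once a baseline exists, a slow sample is flagged as soon as it is observed instead of when a physician calls in.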
Keeping up with monitoring traffic overload
The OPNET AppResponse Xpert analysis is only as good as the data it receives. The appliance required input from four 10G SPAN ports on Cook Children’s Cisco 6509 and Nexus 7000 routers. 50% of the traffic from these SPAN ports consisted of unneeded duplicate packets. Duplicate packets can decrease the processing efficiency of AppResponse Xpert and reduce the storage capacity available for useful packets (essential for “back in time” analysis). The Anue NTO was used to aggregate the four 10G links and remove all duplicate packets, giving the OPNET solution complete, efficient visibility. The Anue NTO also allowed traffic from the Cisco SPAN ports to be shared with Cook Children’s other packet-based tools, which also required network access.
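To make the aggregate-and-deduplicate step concrete, here is a minimal software sketch. The Anue NTO is a dedicated appliance and its actual method was not described in the presentation, so everything below is an assumption for illustration: packets are modeled as `(timestamp, payload)` pairs, two SPAN feeds are merged by timestamp, and a packet counts as a duplicate if an identical payload was already seen within a short, hypothetical time window.

```python
import hashlib
from collections import OrderedDict

def deduplicate(packets, window_s=0.05):
    """Drop packets whose payload was already seen within window_s seconds.

    packets must be (timestamp_seconds, payload_bytes) pairs sorted by timestamp.
    """
    seen = OrderedDict()  # payload digest -> timestamp of first sighting
    kept = []
    for ts, payload in packets:
        # Expire digests that have aged out of the comparison window.
        while seen:
            oldest_digest, oldest_ts = next(iter(seen.items()))
            if ts - oldest_ts > window_s:
                seen.popitem(last=False)
            else:
                break
        digest = hashlib.sha256(payload).digest()
        if digest in seen:
            continue  # same payload seen recently: treat as a duplicate
        seen[digest] = ts
        kept.append((ts, payload))
    return kept

# Two hypothetical SPAN feeds that both captured some of the same packets.
span_a = [(0.000, b"pkt-1"), (0.010, b"pkt-2")]
span_b = [(0.001, b"pkt-1"), (0.011, b"pkt-2"), (0.020, b"pkt-3")]

merged = sorted(span_a + span_b)   # aggregate the feeds in timestamp order
unique = deduplicate(merged)
print(f"{len(merged)} packets in, {len(unique)} packets out")
```

The time window matters: a payload that reappears much later is more likely a legitimate retransmission than a SPAN duplicate, so the sketch keeps it.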
Benefits
“The joint OPNET/Anue solution has cut the time it takes to diagnose performance problems from days to hours,” said Ross Jones, Network Manager at Cook Children’s Health.
- Faster troubleshooting, with problem isolation reduced from days to hours.
- Proactively seeing and resolving issues before they become problems.
- Complete network traffic visibility for application performance analysis.
- Improving efficiency of APM system by removing duplicate packets.
- No interference with other departments’ network probes thanks to SPAN/Tap sharing.