Aviation Investigation Report A11O0031
The Transportation Safety Board of Canada (TSB) investigated this occurrence for the purpose of advancing transportation safety. It is not the function of the Board to assign fault or determine civil or criminal liability.
Erroneous air data indications
Sunwing Airlines Inc.
Boeing 737-8Q8, C-FTAH
Toronto–Lester B. Pearson International Airport
The Boeing 737-8Q8 (registration C-FTAH, serial number 29351) was operating as Sunwing Airlines flight 531 from Toronto–Lester B. Pearson International Airport in Toronto, Ontario, to Cozumel International Airport, Mexico, with 189 passengers and 7 crew members on board. During the take-off run, at about 90 knots indicated airspeed, the autothrottle disengaged after take-off thrust was set. As the aircraft approached the critical engine failure recognition speed, the first officer, who was the pilot flying, noticed an AIRSPEED DISAGREE alert and transferred control of the aircraft to the captain, who then continued the take-off. During the initial climb, the aircraft received a stall warning (stick shaker), followed by a flight director command to pitch to a 5° nose-down attitude. The take-off was being conducted in visual conditions, allowing the captain to determine that the flight director commands were erroneous. The captain ignored the flight director commands and maintained a climbing attitude. The crew advised the air traffic controller of a technical problem that required a return to Toronto. The crew did not declare an emergency, but requested that aircraft rescue and firefighting services be placed on standby due to the overweight landing. The occurrence took place at 0657 Eastern Daylight Time, during hours of darkness. The aircraft landed at 0723, during hours of civil twilight.
History of the flight
The flight was planned in such a way that the first officer (FO), occupying the right seat, was the pilot flying for the take-off, while the captain, occupying the left seat, was the pilot monitoring. The length of Runway 23 and the gross take-off weight allowed for a reduced-thrust take-off. According to the Boeing Flight Crew Operating Manual (FCOM), V1 was determined to be 149 knots indicated airspeed (KIAS).
V1 explained: Section 500.03 of the Canadian Aviation Regulations defines V1 as: "the maximum speed in the take-off at which the pilot must take the first action (e.g., apply brakes, reduce thrust, deploy speed brakes) to stop the aeroplane within the accelerate-stop distance. V1 also means the minimum speed in the take-off, following a failure of the critical engine at Vef at which the pilot can continue the take-off and achieve the required height above the take-off surface within the takeoff distance." The actual value of V1 varies depending primarily on the weight of the aircraft and the available runway length.
The cockpit was set up appropriately for the FO to be the pilot flying, including the selection of the FO's flight director as the master flight director. This means that it would provide information for both the captain's and FO's flight instrument displays, except for the take-off run and initial climb periods.1
The tower controller instructed Sunwing Airlines flight 531 (SWG 531) to line up on Runway 23, and the clearance was acknowledged. The tower controller subsequently cleared SWG 531 for take-off. SWG 531 transmitted what sounded like a carrier signal with an open microphone, without a discernible read-back of the clearance. Fifteen seconds later, the tower controller repeated the take-off clearance and again received a similar response.
The aircraft positioned itself on the runway for take-off at 0655.2 The crew activated the take-off and go-around (TOGA) switch. The thrust levers moved forward to the take-off setting under the control of the autothrottle system, and the flight director command bars on both pilots' primary flight displays (PFD) commanded a pitch attitude of 10° nose-down as designed. The take-off sequence continued as follows:
|Time from brake release (min:sec)||Event|
At 0656:49, the TOGA switch was depressed, the autothrottle positioned the thrust levers, the FO released the brakes and began the take-off run.
Indicated airspeed3 (Vi) = 60 knots – The captain's flight director commanded a 15° nose-up pitch attitude on the captain's PFD.
Vi= 80 knots – The captain called "80 knots". The FO observed less than 80 knots, and attributed the discrepancy to the call being made early.
From the digital flight data recorder (DFDR), both left and right electronic engine controls (EEC) reverted to the soft alternate mode of operation.4 5 This condition led to the disengagement of the autothrottle within the next 2 seconds.
Vi= 90 knots – The autothrottle disengaged and the master caution light illuminated. The captain cancelled the caution light and verified that proper thrust was set.
Vi= 105 knots – The FO flight director commanded a 15° nose-up pitch attitude on FO's PFD.
Vi≈ 139-149 knots – As the automated V1 call6 was heard, the FO noticed his airspeed was low and that there was an AIRSPEED DISAGREE alert7; he transferred control to the captain, who assumed control of the airplane and continued the take-off.
Vi= 150 knots – The DFDR showed aft movement of the captain's control column, indicating that the captain had initiated rotation.
Vi= 154 knots – The aircraft pitch attitude began to change, and the aircraft began to rotate.
Vi= 166 knots – The DFDR indicated that the airplane lifted off.
Vi= 179 knots – The pitch attitude reached 15°. According to the DFDR, the captain's flight control computer (FCC A) was commanding a nose-up pitch change of 1°, which would result in a pitch attitude of 16°. The FO's flight control computer (FCC B) was commanding a nose-down pitch change of 5°, for an aircraft attitude of 10°. This represented a 6° disparity between the 2 flight control computers.
Vi= 179 knots – The radio altimeter height reached 219 feet above ground level (agl). Both flight control computers put out a discrete bias-out-of-view (BOV) signal,8 followed by a computed value, a discrete zero value, and a computed value. This sequence was cycled through 4 times, for a duration of 14 seconds. At this time, the radio altimeter indicated a height of 791 feet agl. There was no corresponding movement of the flight director pitch command bars.9
Around this time, both pilots felt the stick shaker activate for an estimated 6 to 8 seconds. Simultaneous with the stick shaker activation, a sound, assumed to be the overspeed clacker, was heard on the captain's headset. The FO did not hear a clacker. The DFDR contained no indication of the stick shaker at this point in the flight or of the clacker at any point during the flight. The nose was lowered slightly while maintaining a positive climb attitude, and TOGA power was confirmed.
The DFDR indicated that the captain's flight control computer, when not biased out of view, was putting out a flight director command to return to a 15° nose-up pitch attitude, a command that would put the flight director pitch command bar very close to the airplane reference symbol on the display. The flight director command bar indicated to pitch nose down to an attitude 5° below the horizon bar on the PFD.
Usually the autopilot would be engaged at this point. However, use of the autopilot is discretionary, and the captain elected not to engage it. The captain maintained a 12 to 15° pitch attitude and a positive rate of climb by direct reference to the attitude instruments and the outside horizon. The flight director speed selector was increased to the appropriate climb speed in accordance with normal procedures.
Time from brake release = 1:08 – The TOGA mode disengaged and the BOV behavior of the flight director pitch commands ceased. There continued to be erroneous flight director pitch commands.
Vi= 189 knots, altitude ≈ 2000 feet10 – The right-hand stick shaker activated for 3 seconds. This was the only recorded activation of either stick shaker on the DFDR for the duration of the flight. The speed selector was increased further to facilitate a speed increase in order to avoid further stall warnings.
Flaps and slats were selected up, and the aircraft climbed to approximately 2400 feet, as indicated on the captain's altimeter11.
The crew carried out the "Airspeed Unreliable" checklist in accordance with the Quick Reference Handbook (QRH). It was determined that the left airspeed and altitude were reliable and that the right airspeed was erroneous by reference to the standby airspeed, ground speed, pitch angle for the power setting, and comparison between the captain's and FO's instruments.
The aircraft climbed to 3000 feet. Runway heading was maintained for approximately 8 minutes after take-off.
There was a ceiling of 1800 feet agl (2400 feet indicated altitude) at Toronto–Lester B. Pearson International Airport; however, flight at 3000 feet was mainly clear of cloud, with only brief intermittent flight in cloud or loss of visual contact with the ground.
Reaching 3000 feet, the crew engaged altitude mode on the flight director, and the flight director commands displayed on the captain's side were normal.
The crew received brief altitude disagree indications. This is attributed to the right ADIRU applying an incorrect position-error correction to the right altitude, which resulted in an error that exceeded the criteria for the disagreement alert.
The crew did not declare an emergency; however, due to an overweight landing, they requested to have aircraft rescue and firefighting (ARFF) services on standby for the landing as a precaution against hot brakes.
The Toronto departure controller advised SWG 531 that it was transmitting the sound of an open microphone with no discernible voice. A similar problem had occurred prior to take-off. The problem persisted for about 6 minutes; the captain resolved it by manipulating the switch that selects either the microphone or oxygen mask.12 There was no further communication difficulty during the flight.
The aircraft landed without further incident. The brakes did not overheat.
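The comparison step of the "Airspeed Unreliable" checklist, in which the captain's, FO's, and standby airspeed indications were compared, can be illustrated with a short Python sketch. It is not the QRH procedure or any aircraft logic: the function name, the 5-knot tolerance, and the sample readings are assumptions chosen only for illustration.

```python
from statistics import median

def find_outlier(readings, tolerance_kt=5.0):
    """Return the name of the single source that disagrees with the others, or None.

    readings: dict mapping a source name to an indicated airspeed in knots.
    tolerance_kt: hypothetical agreement threshold, not a Boeing value.
    """
    mid = median(readings.values())
    outliers = [name for name, kt in readings.items()
                if abs(kt - mid) > tolerance_kt]
    # Comparison alone can isolate only one disagreeing source.
    return outliers[0] if len(outliers) == 1 else None

# Hypothetical readings, not DFDR values.
sample = {"captain": 189.0, "first_officer": 160.0, "standby": 188.0}
print(find_outlier(sample))
```

A comparison of this kind cannot tell which source is correct when two of them share a fault, which is why the crew also referred to ground speed and to pitch angle for the power setting.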
The captain debriefed the operator's maintenance personnel on the event. This included a discussion regarding what events to record in the aircraft journey log. In the end, only the airspeed unreliability and overweight landing were entered in the journey log.13 The captain also submitted a company Flight Safety Report (FSR) which mentioned the airspeed unreliability and the stick shaker. The third report that the captain filed was to the chief pilot; this report included the stick shaker, overspeed clacker, erroneous flight director command, altitude unreliability, autothrottle disengage, EEC reversion to alternate mode, and the radio problem.
Weather at the time of the occurrence was reported as follows: overcast clouds at 1800 feet agl, visibility of 15 statute miles, and winds from 290° true at 11 knots. The temperature was 1 °C and the dew point was -2 °C, with an altimeter setting of 29.98 inches of mercury.
Weather did not contribute to the occurrence. Conditions facilitated the crew's reliance on outside visual references until they determined which cockpit instruments were reliable and which were not.
|Item||Captain||First officer|
|Pilot licence||Airline transport||Airline transport|
|Medical expiry date||01 May 2011||01 October 2011|
|Total flying hours||7500||5000|
|Hours on type||3000||3700|
|Hours last 90 days||240||169|
|Hours on type last 90 days||240||169|
|Hours last 30 days||80||43|
|Hours on type last 30 days||80||43|
|Hours on duty prior to landing||2.5||2.5|
|Hours off duty prior to work period||72||72|
Records indicate that the flight crew members were certified and qualified for the flight in accordance with existing regulations. Both were returning from 72 hours off duty and were well rested.
Records indicated that the aircraft was certified, equipped, and maintained in accordance with existing regulations and approved procedures.
A schematic of the Boeing 737-800 aircraft pitot-static system is presented in Appendix D. The left and right-hand pitot tubes are each attached to an air data module (ADM) that converts total pressure to an electrical signal that is sent to the respective ADIRU. The ADIRU calculates airspeed and altitude based on the static and total pressures, including correction for position error. Each ADIRU provides the information derived from the pitot-static system to other aircraft systems, including the flight director, the stall management/yaw damper (SMYD), and the display electronic unit (DEU). These systems are duplicated on the left and right side of the airplane and operate independently. The DEUs are connected to a single ARINC 429 digital data bus which provides for information exchange to other aircraft systems, including cockpit displays and EECs. The aircraft is also equipped with an integrated standby flight display, which includes airspeed and altitude displays from an independent third source.
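The airspeed calculation described above can be sketched with the standard subsonic relation between impact pressure (total minus static pressure) and calibrated airspeed. This is a textbook formula, not the ADIRU's actual algorithm. Position-error correction is omitted, and the function names and the assumed 20% loss of impact pressure are hypothetical values used only for illustration.

```python
import math

P0_PA = 101325.0   # ISA sea-level static pressure, Pa
A0_KT = 661.4788   # ISA sea-level speed of sound, knots

def cas_from_pressures(total_pa, static_pa):
    """Calibrated airspeed (knots) from total and static pressure, subsonic flow."""
    qc = max(total_pa - static_pa, 0.0)  # impact pressure, clamped at zero
    return A0_KT * math.sqrt(5.0 * ((qc / P0_PA + 1.0) ** (2.0 / 7.0) - 1.0))

def impact_pressure_from_cas(cas_kt):
    """Inverse relation: impact pressure (Pa) for a given calibrated airspeed."""
    return P0_PA * ((1.0 + 0.2 * (cas_kt / A0_KT) ** 2) ** 3.5 - 1.0)

# Hypothetical case: one pitot source senses only 80% of the true impact pressure.
static = P0_PA
qc_true = impact_pressure_from_cas(150.0)
good_side = cas_from_pressures(static + qc_true, static)
low_side = cas_from_pressures(static + 0.8 * qc_true, static)
print(round(good_side, 1), round(low_side, 1))
```

Because each side computes airspeed from its own pitot source, a reduced total pressure on one side lowers only that side's computed airspeed. The sketch shows this general sensitivity and does not represent a finding about the cause of this occurrence.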
The maintenance crew carried out diagnostic checks of the aircraft systems and found no faults, although the air data and inertial reference system (ADIRS) had recorded 3 AIRSPEED DISAGREE events. The right pitot tube was inspected for foreign material and none was found.
On 11 March 2011, 2 days before this incident, the aircraft's right-side pitot tube struck an owl during a take-off in Puerto Vallarta, Mexico. Journey log entries indicate that the EECs reverted to the soft alternate mode, and the FO's airspeed indication was erroneous. The flight returned to Puerto Vallarta; the pitot tube was cleaned, and the aircraft was returned to service. The aircraft flew 5 flights without any related malfunction until the occurrence flight.
Based on the earlier occurrence and in the absence of the definitive finding of a defective component, the maintenance crew replaced the right-side pitot tube and the corresponding ADM. The aircraft was returned to service and the problem has not recurred since. The ADM was returned for overhaul. An inspection prior to overhaul found that there were no defects, debris or foreign material, and it was functioning normally.
Boeing advisory on erroneous airspeed indications
The airworthiness standards for transport category aircraft require that:
…airplane systems and associated components, considered separately and in relation to other systems, must be designed so that…the occurrence of any failure condition which would prevent the continued safe flight and landing of the airplane is extremely improbable, and the occurrence of any other failure conditions which would reduce the capability of the airplane or the ability of the crew to cope with adverse operating conditions is improbable.14
The standard also states that "warning information must be provided to alert the crew to unsafe operating conditions, and to enable them to take appropriate corrective action."15
In September 2010, Boeing issued an advisory to Boeing 737NG16 operators regarding flight crew and airplane system recognition of and response to erroneous main display airspeed situations. In this advisory, Boeing indicated that erroneous airspeed events may compromise the safety of flight, describing the issue as follows:
The rate of occurrence for multi-channel unreliable airspeed events combined with probability of flight crew inability to recognize and/or respond appropriately in a timely manner is not sufficient to ensure that loss of continued safe flight and landing is extremely improbable.17
The Boeing advisory indicated that the issue extended also to other aircraft models, and that there were a number of factors, including environmental conditions, human factors, and/or hardware failures, that may contribute to the higher-than-predicted frequency of occurrences. It also indicated that the content of Non-Normal Checklists may delay crew response, or may contribute to this delay.
The safety oversight component of Sunwing's safety management system (SMS) has a proactive process that analyzes hazards. Sunwing received the notice from Boeing. Although Boeing had noted that the flight crew training curriculum did not require recurring training for an erroneous airspeed condition and that such events were occurring more frequently than predicted, Sunwing did not consider the notice as a statement of a hazard that should be analyzed by its proactive process. Therefore, the document was not circulated to flight crews.
On 22 March 2012, Boeing issued an update to this advisory, indicating that it had found no single root cause for the issue. Boeing identified training and procedural measures to mitigate the problem, and stated that changes were being developed for inclusion in the FCOM and maintenance manuals and that supplemental material was being developed for inclusion in the flight crew training manual, with an estimated completion date of 09 October 2012.
Sunwing Airlines Inc. holds an air operator certificate and is an approved maintenance organization. Sunwing began its operations in November 2005 and provides domestic scheduled and non-scheduled flights to destinations in the Caribbean, Mexico, and the United States from hubs at the Toronto–Lester B. Pearson International Airport and the Montréal–Trudeau International Airport.
The company employs more than 500 personnel, including aircraft maintenance engineers (AME), flight attendants, and pilots. In addition, personnel of various disciplines are employed to support the operation, including dispatchers, crew schedulers, and support personnel for operations and maintenance control functions.
The Sunwing fleet at the time of the occurrence consisted of leased Boeing B737-800 aircraft. The fleet size varied seasonally from about 4 to 20 aircraft.
Safety management systems
In 2005, the Canadian Aviation Regulations (CARs) were amended to require the holders of certain Canadian Aviation Documents, including air operator certificates issued under section 705.07 of the CARs, to establish, maintain, and adhere to an SMS. SMS have been adopted internationally by the International Civil Aviation Organization (ICAO), to which Canada is a signatory.
An SMS is designed to systematically integrate hazard identification and risk management into a company's operations and to become part of the way it does business, throughout the organization. This means that safety management is no longer a separate activity within the company structure. Companies operating under Part 705 of the CARs are required to have in place a documented SMS which includes, in part:
- a safety policy that the accountable executive has approved and has communicated to all employees;
- a policy for the internal reporting of hazards, incidents or accidents, including the condition under which immunity from disciplinary action will be granted;
- procedures for the collection of data relating to hazards;
- procedures for analysing and for taking corrective actions;
- procedures for establishing and measuring performance goals, making progress reports, and reviewing the safety management system to determine its effectiveness.18
Notably, an SMS includes:
- a reactive process that reports, investigates, analyzes, and corrects reported hazards, events, and safety concerns; and
- a proactive process that seeks to identify potential hazards and evaluate the associated risks before adverse events occur.
Although 705 operators are not required to have a quality assurance (QA) program, their safety management plan must include a review of the safety management system to determine its effectiveness.19
Transport Canada carries out assessments of operators' SMS to determine their effectiveness. These consist of a documentation review and an on-site review of the entire organization to determine if the SMS is documented, in place and effective. In addition, Transport Canada conducts program validation inspections (PVI), which consist of a focused review of 1 or more components of an organization or its SMS.
Transport Canada conducts surveillance of an operator's overall SMS processes rather than detailed prescriptive oversight of individual activities and actions, as was previously practiced. Transport Canada's guidance to inspectors conducting SMS assessments and PVIs at the time of the occurrence stated:
The introduction of Safety Management Systems (SMS) for the aviation industry will fundamentally change the way Transport Canada (TC) approaches its oversight responsibilities… Traditional oversight methods focused solely on determining regulatory compliance using a system of direct inspection of an organization's aircraft, personnel, records and other systems. The new approach employing Assessments and PVIs will allow TC's oversight to evolve beyond compliance auditing to a system that examines the effectiveness of an organization's management system. These changes are consistent with the principles of safety management systems where the organization is expected to take an ownership role in proactively managing risks and have programs in place to ensure they comply with regulatory requirements. TC's role is to ensure that organizations have effective policies, processes and procedures in place to accomplish this.20
As an operator's SMS matures, TC oversight would shift from a traditional audit and inspection to process auditing. The monitoring of SMS outputs would increasingly focus on the results of the operator's QA program.
Sunwing's safety management system
Sunwing has established an SMS in accordance with section 705.07 of the CARs. Sunwing's Safety Management System Manual was published in May 2006 and underwent several revisions as the airline prepared to comply with the CARs SMS requirement. At the time of this occurrence, the Manual was at revision 7, dated 30 June 2009. The SMS Manual contained material concerning the operator's organizational structure and the overall design and function of the SMS within the company, as well as the identification of the accountable executive (the company president), and the roles and responsibilities of key players in the SMS, including the Safety Office. It defined the safety management plan, document management processes, safety oversight, training, quality assurance, and emergency preparedness and response.
The SMS Manual indicated that Sunwing had completed the initial development of its SMS, including document management and training components, reactive reporting and proactive reporting, and hazard analysis elements. The Manual detailed the process for the development, review, and promulgation of the company safety and non-punitive reporting policies, both of which were appended to the Manual. The Manual also contained the associated documentation and communication processes, including written, oral, and electronic methods, and various ways of involving employees, such as meetings, surveys, contests, and suggestion programs.
The safety oversight section identified reactive and proactive processes as the 2 principal means of safety oversight within the company. Reactive processes were identified as those resulting from occurrence reports; proactive processes were those resulting from safety assessments, hazard reports, and evaluations. The Manual identified the policy basis for occurrence reporting and specified the company forms and methods that could be used by employees to prepare and submit reports. Both the reactive and proactive procedures began with an employee identifying an event or safety hazard in a report to the Safety Office. The SMS Manual detailed the processes for handling the report between the Safety Office and the department involved, and it provided details for how to carry out hazard and risk analyses.21
For reactive reporting, the SMS Manual specified events for which safety reports were mandatory. SMS training for employees included a module entitled "Reportable events – what to report". However, the reportable events were not documented.
Under the proactive process, the SMS Manual identified the requirement to conduct a hazard analysis before significant changes to the company operation were made, including, but not limited to: addition of aircraft of an existing type to the fleet, addition of a new aircraft type, procedure changes affecting operational safety, changes to the company's organizational structure, key personnel or lines of reporting or communication.
Both the reactive and proactive processes incorporated an early risk assessment, used to determine the level of response and type of investigation. A full investigation was to be carried out if the event was recognized as posing a significant risk to the company.
The quality assurance aspect of the SMS Manual focused on the functioning of SMS processes within the company, with the objective of assuring regulatory compliance and conformity of work practices to documented processes, as well as assessing effectiveness of these processes.
Sunwing's SMS was reviewed by Transport Canada in September 2009. At this time, Transport Canada carried out an SMS assessment that included a review of Sunwing's policy and procedures manuals for conformance to the applicable regulations. It also included an on-site review to assess the level of knowledge pertaining to individual duties and responsibilities and to determine if the organization's documented processes and procedures were available to use. As a result of this assessment, TC found several deficiencies with Sunwing's SMS.
The following findings were most relevant to this occurrence:
- Hazard analysis procedure not practiced as per SMS Manual: the example given was that hazard analyses concerning operating at a new airport had been carried out, but not documented. Another example given was that, although there was a process for analyzing the effect of changes to key personnel, there was no documentation to state which personnel were key. The company's corrective action plan proposed to update the hazard analysis training process, provide guidance on when and how to complete and document the process, amend the Flight Operations Manual to include requirements for documentation and storage of hazard assessments, and to document the definition of key personnel in the SMS Policy Manual.
- Investigation procedure: the SMS assessment found that procedures for the conduct of investigations were not detailed in the SMS Manual, and safety coordinators were unable to explain the process. The company's corrective action plan indicated that a review of the effectiveness of the investigation procedure had been completed and that it would be documented in the SMS Procedures Manual. It proposed to provide training to the relevant personnel.
- Incomplete training of investigators: the SMS assessment found that safety coordinators had not received training specific to their responsibilities, and that the training program did not address all of their responsibilities. The assessment also found that the SMS Manual did not contain a process to ensure the competency of safety coordinators. The corrective action plan proposed to update the relevant training, develop competency requirements, and document the competency assessment process.
The company's follow-up corrective action plan was reviewed and accepted by Transport Canada.
Operator's response to this occurrence
This occurrence was not recognized at the time as being sufficiently serious in nature to warrant calling in company safety personnel or as an occurrence that had to be reported to the TSB. Therefore, no immediate action was taken that would assist in an investigation, such as the preservation of flight data and cockpit voice recordings.
The captain submitted a report to the maintenance department, a company FSR to the safety department and a separate written report to the chief pilot through routine channels. None of these separate reports triggered recognition of the potential risk of this occurrence. The operator's SMS did recognize the declaration of an emergency as a TSB-reportable occurrence. However, in this case, the crew did not declare an emergency; rather, they requested that ARFF be placed on standby as a precaution due to the potential for hot brakes after an overweight landing. This was reported in a NAV CANADA Aviation Occurrence Report (AOR).
The FSR was written and submitted the same day, but was not received in the company Safety Office until 2 days after the incident. The type of occurrence was indicated by ticking off the "Warning or Alert" and "Emergency Declaration" boxes on the form. However, another box entitled "Emergency Declared" was ticked off as "No", and a narrative stated that no emergency was declared, but that emergency crews were asked for upon landing and had followed the aircraft to the gate. It was also explained that the stall warning (stick shaker) was preceded by unreliable airspeed at V1.
The TSB Regulations define any request for standby of emergency response services and difficulties in controlling the aircraft owing to an aircraft system malfunction as being reportable occurrences for an aircraft of the weight of a Boeing 737. The operator did not notify the TSB until after the NAV CANADA AOR was publicly reported through the Transport Canada Civil Aviation Daily Occurrence Reporting System (CADORS), and then only to explain that the occurrence was not considered a reportable incident by the company. The full nature of the occurrence was not known for several days, and only after the TSB made further enquiries and Sunwing's Safety Office obtained a copy of the report that the captain had submitted to the chief pilot.
The FSR has a provision for pilots to suggest preventative action, but there is no provision to identify the seriousness of risk or level of urgency. Neither the operational nor safety organizations within the company recognized that there was a risk that warranted further assessment within the operator's SMS program framework. It was treated as an abnormal condition addressed by the flight crew, with a successful outcome.
The operator then carried out an assessment of the occurrence. Its draft SMS report identified 2 issues. The first was the airspeed disagreement, for which the root cause was concluded to be a technical issue (pitot tube contamination) requiring no further analysis. The second was that the event was not classified as a TSB-reportable occurrence. This was attributed to weaknesses in crew training: crew members were not aware that requesting emergency vehicles on standby was indicative that an emergency condition existed, and therefore reportable to the TSB. This was to be remedied by amending crew training programs. The occurrence was to be added to the company database for trend analysis.
Sunwing Airlines Flight Crew Operation Manual
The following procedures contained in the Sunwing Airlines FCOM are relevant to this occurrence:
Transfer of control: There are several procedures in the FCOM that require the transfer of control between the 2 flight crew, including non-normal situations. It is stated or implied that either pilot should be ready to take control if necessary. The procedure for transferring control is as follows:
1.23 TRANSFER OF CONTROL
The PIC will determine the (PF) and the (PM) prior to flight. At all times there will be a clear understanding of who is controlling the aircraft. The acceptable method of transferring control is by stating:
"I HAVE CONTROL", acknowledged by "YOU HAVE CONTROL"
The Flight Crewmembers may switch PF/PM duties at any time, as long as there is a clear understanding of duties and a clear understanding of which Pilot is the PF.22
Rejected take-off: The FCOM procedure for a rejected take-off is presented, along with other relevant extracts from the FCOM relating to rejected take-offs, in Appendix B. The FCOM indicates a number of malfunctions that would be cause for reject under 80 knots. Above 80 knots, the following guidance is given:
The takeoff above 80 kts (high speed regime) will be rejected immediately in the event of an engine failure, engine fire, unsafe configuration, predictive windshear warning or any other situation adversely affecting the safety of flight. Once thrust is set and the takeoff roll has been established, rejecting a takeoff solely for illumination of the Master Caution Light is NOT recommended.23
Airspeed unreliable: The unreliable airspeed procedure is presented in Appendix C. It is silent as to the disengagement of the autopilot and as to the selection or reselection of the master flight director. There is nothing in the QRH to indicate that airspeed unreliability could be caused by conditions that might produce erroneous flight director commands or false stall or overspeed warnings.
Declaration of emergency: The FCOM notes the need for emergency vehicles at the conclusion of certain non-normal procedures that might involve hot brakes or passenger evacuation. There was nothing in the FCOM or other Sunwing documentation that differed from the guidance in TP14371, the Transport Canada Aeronautical Information Manual (TC AIM):
An emergency condition is classified in accordance with the degree of danger or hazard being experienced, as follows:
- Distress: A condition of being threatened by serious and/or imminent danger and requiring immediate assistance.
- Urgency: A condition concerning the safety of an aircraft or other vehicle, or of some person on board or within sight, which does not require immediate assistance.
The radiotelephone distress signal MAYDAY and the radiotelephone urgency signal PAN PAN must be used at the commencement of the first distress and urgency communication, respectively, and, if considered necessary, at the commencement of any subsequent communication.24
Rejected take-off studies
In 1990, a National Transportation Safety Board study25 found that the potential for an accident was high following a high-speed (at or above 100 knots) reject. The study found indications that high-speed rejects were often unnecessary or improperly performed. The report made several recommendations relating to policies, procedures, and training for rejected take-offs and 1 recommendation to redefine V1 to better convey its meaning.
As a result, the United States Federal Aviation Administration, in co-operation with major aircraft manufacturers, prepared the Takeoff Safety Training Aid,26 which examines the various performance risk factors associated with rejected take-offs. It is noted that the common names associated with V1 speeds—critical engine failure recognition speed, take-off decision speed, and go/no-go speed—are misleading because they fail to imply that recognition and decision have to precede V1 in order to safely reject a take-off and achieve the stopping distance determined by certification methods.
Since then, non-normal procedures for transport category aircraft do not advise rejecting at high speeds for relatively minor malfunctions. The training recommended by the Takeoff Safety Training Aid reinforces that high-speed rejects should be avoided except in certain critical situations. Boeing and Airbus have both published guidance material that is consistent with this approach, with Boeing defining 80 knots as the demarcation between high speed and low speed for the purposes of rejects.
A review of unreliable airspeed events indicates that there is a risk of significant loss of life if crews do not respond appropriately:
- February 1996 – After taking off from Puerto Plata, Dominican Republic, a Boeing 757 crashed, causing 189 fatalities, as a result of erroneous airspeed indications, most likely due to a blocked pitot tube.
- October 1996 – Shortly after take-off from Lima, Peru, a Boeing 757 crashed as a result of erroneous airspeed and altitude, due most likely to partially blocked static ports.
- February 2006 – A National Jet Boeing 717-200 (VH-NXH) experienced erroneous airspeed indications and stick shaker activation, most likely due to ice restricting movement of the angle-of-attack sensors.
- 01 June 2009, Air France flight 447 (Airbus A330-203) en route from Rio de Janeiro to Paris – The report indicated that there were airspeed indication discrepancies leading up to and during the sequence of events that culminated in the uncontrolled descent of the aircraft into the Atlantic Ocean with 228 fatalities.27
- 19 June 2009, LOT Polish Airlines flight 2, Boeing 767-30028 – Erroneous instrument indications resulted in airspeed and altitude deviations. Erroneous captain's airspeed and altitude indications were not correctly identified. The maintenance crew found no fault in the aircraft's systems, and the aircraft operated for another month before the difficulty recurred. An intermittent fault was found in the left-side air data computer.
The aircraft was equipped with a DFDR and a cockpit voice recorder (CVR). The recorders were not removed and read following the occurrence, and the aircraft was returned to service. The CVR was overwritten before the investigation began.
Data from the DFDR were downloaded a few days after the event, but did not contain any record that matched the date and time of the occurrence. It was determined that the date channel on the DFDR was recording erroneous information. No flight could be found with a profile matching the occurrence flight, and it was concluded that the DFDR had been entirely over-written since the occurrence.
The operator routinely downloaded DFDR data for use in its engine monitoring program, but for no other purpose. These files contained complete sets of DFDR parameters. Files were obtained for 17 flights including several before and after the incident flight. One entire flight was missing. DFDR time comes from the aircraft's clock, which may be incorrectly set when the battery is replaced. In this instance, the error was noted by the engine monitoring office, and the clock was reset and the engine monitoring data corrected. However, the original flight data recorder (FDR) file remained unchanged. The operator's Safety Office was unaware of the discrepancy when the files were provided to the TSB because it is not involved in the engine monitoring program.
DFDR data for this occurrence are presented in Appendix A.
The following TSB Laboratory report was completed:
- LP029/2011 - FDR Analysis
This report is available from the Transportation Safety Board of Canada upon request.
The analysis will focus on the risks when flight crews are faced with unresolved, ambiguous instrument indications during the high-speed phase of the take-off run, through rotation and initial climb-out.
In this occurrence, had the aircraft not been in visual conditions, the crew may not have had the visual cues to support its decision not to follow the flight director when it commanded a 5° pitch-down attitude at low level, after the system automatically switched to the master flight director at 400 feet above ground level (agl).
In addition, the analysis will examine the operator's safety management system (SMS) response to the Boeing advisory and following this occurrence.
The investigation was inconclusive as to the source of the nose-down pitch command that was seen by the captain. The digital flight data recorder (DFDR) data indicated that the captain's flight control computer was putting out an appropriate command for a climbing attitude. Also unexplained and not recorded on the DFDR was the first stall warning (stick shaker) at about 400 feet agl in the climb. A second stick shaker was recorded a short time later on the right and was the result of the right air data system computing an erroneous airspeed. The cause of the erroneous airspeed was not ascertained, and no system malfunctions were identified by diagnostic checks performed after the occurrence. The right pitot tube and the air data module (ADM) were replaced, and no faults were found in either unit when examined by the overhaul facility.
In view of the Boeing advisory related to erroneous airspeed that could compromise safety of flight, and the varied nature of possible causes, this investigation focused on the defences in place to mitigate the risk. Amongst these defences is crew recognition and response, including the decision not to reject the take-off when the airspeed discrepancy was first detected, the decision to transfer control during a critical phase of flight just before V1, and the decision not to declare an emergency during the return to Toronto.
During the take-off run, the first officer (FO) was unaware of a discrepancy in airspeeds until the captain made the 80-knot call. It is possible that the 80-knot call was made early. At that point, the autothrottle disengaged, and the captain dealt with this. The attention of the crew was not on the possibility of an airspeed discrepancy. By the time the FO was certain that there was an airspeed issue, the airspeed had almost reached V1.
Control was transferred by the FO to the captain immediately before V1. At this point, the alternatives were to reject the take-off, which also requires control to be transferred to the captain, or for the FO to continue the take-off without a reliable indication of airspeed and without valid flight director commands. Transfer of control is required in a number of situations. Readiness for that eventuality is advised in the Flight Crew Operating Manual (FCOM) and is regularly practiced; the pilot monitoring has his hands ready to take control during all critical phases of flight.
Crew training and guidance in the FCOM caution against a reject at this point, unless the situation is a serious threat to flight safety. The captain clearly did not perceive such a situation, as indications on his side of the cockpit were normal. The decision to reject rests solely with the captain. If the FO considered a reject necessary, the FO would need to convey this to the captain, who would have to assess the situation and decide if a reject is necessary. It is unlikely that this could have been accomplished before V1.
During the return for landing, the crew opted not to declare an emergency when asked by air traffic control (ATC). The crew had complete control of the aircraft and did not consider the flight to be in a situation of distress or urgency, as defined in the Transport Canada Aeronautical Information Manual (TC AIM). However, they requested aircraft rescue and firefighting (ARFF) services to be on standby due to the overweight landing and the potential for overheated brakes. Overweight landings are not uncommon, and the practice of requesting ARFF to confirm that brakes are not overheated before the aircraft enters the ramp area is also routine. The crew did not understand that, by requesting ARFF to be on standby, they had made the incident a reportable occurrence as per TSB,29 International Civil Aviation Organization (ICAO),30 NAV CANADA,31 and Transport Canada32 documentation.
There is little guidance given to the crew on dealing with unreliable airspeed indications during and immediately after take-off. Unreliable airspeed is not identified as a condition that constitutes a threat to safety of flight in the context of rejecting a take-off at high speed. This is not inappropriate, considering the potential dangers associated with high-speed rejects and the uncertainty as to the actual speed when airspeed is unreliable.
However, the result is that the crew will continue the take-off and initial climb before running the checklist procedure to ascertain which instruments are correct. When the stick shaker activated 12 seconds after take-off, the normal response to this stall warning would have been to lower the nose to reduce the angle of attack. The flight director also commanded a nose-down pitch. In this phase of flight, loss of terrain clearance is a real risk. With good visual cues, the captain was able to balance these competing indications and climb to a safe altitude.
Only after completing the critical initial climb, and dealing with erroneous stall warnings, a possible overspeed warning and misleading flight director commands, could the crew carry out the Quick Reference Handbook (QRH) procedure and confirm that the right airspeed indication was erroneous. The QRH makes no mention of erroneous flight director or stall warning system performance. It does not provide any indication to deselect the flight director on the same side as the erroneous air data, nor does it otherwise caution against the risk of invalid flight director commands. The crew did not use the autopilot and maintained a flight director selection that resulted in erroneous flight director commands continuing to be displayed. This is because the right flight director was the master, which relied on the erroneous right pitot pressure. Continued use of erroneous guidance in adverse weather could seriously compromise flight safety.
The Boeing advisory on erroneous airspeed events acknowledges that the air data system is part of a complex multi-channel integrated system and that technical failures can have varying causes. In such systems, not only can there be multiple potential sources of failure, there can also be multiple potential symptoms of failure. With a higher-than-predicted rate of occurrence of such failures, and problems with appropriate recognition and response by flight crews, the probability of continued safe flight and landing may be less than predicted when the airplane was certified.
Sunwing received the notice from Boeing. Although Boeing had noted that the flight crew training curriculum did not require recurring training for an erroneous airspeed condition and that such events were occurring more frequently than predicted, Sunwing did not consider the notice as a statement of a hazard that should be analyzed by its proactive process. Therefore, the document was not circulated to flight crews.
The overspeed clacker heard by the captain (but not by the FO) is not explained by low airspeed in the right-hand air data system. It was heard at the same time as the stick shaker activation, at a time when there was a problem switching between the boom and mask microphones in the cockpit. In all likelihood, the clacker sound on the headset is attributable to the mask microphone being active and picking up the sound of the stick shaker. There was no other indication of an overspeed warning. The flight crew experienced and accepted issues with the boom and mask microphones during the occurrence. The acceptance by flight crews and companies of known equipment problems, such as this microphone problem, could put safety of flight at risk.
The investigation was hampered by problems with DFDR data and an overwriting of the cockpit voice recorder (CVR). Since the occurrence was not initially recognized by the operator as safety-significant, several days passed before DFDR data were examined and found to be invalid. These issues could have been more easily pursued and resolved had the operator either identified the occurrence as being indicative of a risk that warranted assessment under its SMS proactive process, or as being reportable to the TSB under its SMS reactive process.
The recognition of hazards and risk management are central to the SMS concept, which underpins the regulation of CAR 705 air operations in Canada. In this occurrence, the operator did not recognize any hazards worthy of analysis by its SMS. The effective performance of the crew masked the underlying risks, which may go unmitigated given the lack of guidance, training and procedures available to them.
The occurrence, from the point of view of the operator's maintenance organization, was limited to the overweight landing and the airspeed discrepancy that were reported to it via the aircraft journey log entries. The radio problem, the overspeed warning, the stall warning, the electronic engine control (EEC) reversion to the soft alternate mode, and the misleading nose-down flight director command all occurred, but were not documented in the journey log and were not addressed. The aircraft was returned to service without resolution of these defects and, as a result, airworthiness of the aircraft was not assured. The operator's draft SMS report did not recognize this as a risk that warranted further analysis by its SMS, leading to a missed opportunity to identify hazards and reduce risk.
Under an SMS, the scope of an investigation hinges on the preliminary identification of the hazards and risks. Transport Canada's (TC) SMS guidance material and TSB regulations identify specific types of reportable occurrences which may then be subject to a more in-depth investigation and analysis. For complex events such as this occurrence, hazards may not be obvious until the event is investigated for the specific purpose of identifying underlying factors and conditions that may be hazards. This may involve methods such as "what-if" analyses which, according to the TC guidance material, come much later in the investigative process than the preliminary assessment. The result is that Sunwing's SMS processes at the time did not assure that events were adequately investigated to identify hazards that could have had serious consequences in other circumstances.
TC's oversight of CAR 705 operators consists of assessing the effectiveness of their SMS processes. It expects the operator to identify safety issues, carry out risk assessments, and define corrective measures or mitigate risks in accordance with the SMS processes defined by the operator. If an operator's SMS is not effective, hazards may not be identified and risks may not be mitigated. TC oversight validates that the operator's report of such investigations complies with the reporting requirements in the operator's approved SMS manual. TC does not directly corroborate the comprehensiveness of the operator's investigation or adequacy of its scope or depth. TC's role is to ensure that organizations have effective policies, processes, and procedures in place to accomplish this. In this framework, it is not TC's role to identify specific hazards missed by an operator. When an operator's SMS is ineffective, there is an increased risk that hazards will not be identified and mitigated. During the transition to SMS, TC needs to recognize this risk and adjust its oversight activities to be commensurate with the maturity of the operator's SMS.
Findings as to causes and contributing factors
- A failure in the right pitot-static system caused the output of erroneous airspeed data from the right air data and inertial reference unit. This resulted in erroneous airspeed indications, stall warnings, and for unknown reasons, misleading flight director commands being displayed on the aircraft instruments during take-off and initial climb.
Findings as to risk
- When an operator's proactive and reactive safety management system processes do not trigger a risk assessment, there is an increased risk that hazards will not be mitigated.
- Operators that do not recognize this type of event as a reportable aviation occurrence may not report it, conduct an investigation to further analyze or mitigate the risk, or preserve data from the digital flight data recorder to facilitate an investigation.
- If operators do not thoroughly document aircraft malfunctions, there is an increased risk that aircraft deficiencies will not be completely corrected before the aircraft is returned to service.
- If cockpit and data recordings are not available to an investigation, this may preclude the identification and communication of safety deficiencies to advance transportation safety.
- The acceptance by flight crews and companies of known equipment problems, such as the boom and mask microphones switching problem, could put safety of flight at risk.
- During the transition to safety management systems, Transport Canada must recognize that operators may not always identify and mitigate hazards and adjust its oversight activities to be commensurate with the maturity of the operator's safety management system.
Safety action taken
Sunwing Airlines Inc.
The reactive safety management system reporting process has been updated to include a review of the Transportation Safety Board of Canada's criteria for reportable accidents and incidents to facilitate the timely reporting of occurrences and determination of the scope of the investigation.
This report concludes the Transportation Safety Board's investigation into this occurrence. Consequently, the Board authorized the release of this report on 23 January 2013. It was officially released on 27 February 2013.
Appendix A - Flight data recorder33
Appendix B - Boeing 737-800 FCOM - Reject information34
Sunwing flight crew operation manual35
Appendix C - Boeing 737-800 QRH - Airspeed unreliable36
Appendix D - Boeing 737-800 pitot-static system schematic37
The Boeing 737-800 aircraft has three separate and independent pitot-static systems:
- Two primary systems, one on the left side of the airplane, the other on the right. Each system comprises a single pitot probe on the same side as the system connected to a pitot air data module (ADM) and two static ports, one on each side of the airplane, connected together to a static ADM. The ADMs convert the air pressure to an electrical signal that is sent to the respective air data inertial reference unit which calculates the values used in the respective primary flight display.
- One alternate system, comprising a pitot probe and a pair of static ports that are connected directly to standby altitude and airspeed displays.
Footnotes
1. When the take-off and go-around (TOGA) switch is activated, the captain's flight director (left-hand flight director) provides information for the left-hand displays independently of the right-hand flight director, which generates the display information for the FO's displays. For redundancy during this critical phase, the flight director displays remain independent until the aircraft reaches 400 feet above ground level (agl) during the climb, at which point the originally-selected (in this case the right) flight director resumes being the source for both displays.
2. All times are Eastern Daylight Time (Coordinated Universal Time minus 4 hours).
3. Unless otherwise noted, the indicated airspeed (Vi) is the computed airspeed from the left air data inertial reference unit (ADIRU). This value is displayed on the left (pilot-in-command) airspeed indicator and is recorded on the digital flight data recorder (DFDR).
4. The EEC is a full authority digital engine control (one for each engine) which commands engine speed based on, amongst other things, sensed flight conditions. Both EECs receive total pressure signals from both left and right pitot sensors and carry out a validity check of the signals against each other. In the event of loss of or invalid signals, the EEC automatically reverts to a soft alternate mode that controls the engine based on the last valid flight condition. The pilot can manually select a hard alternate mode that follows a predetermined mechanical schedule. A discrepancy between the left and right total pressures would cause both EECs to sense an invalid condition and revert to the soft alternate mode, resulting also in the autothrottle disengaging.
5. This was not noticed by the crew until other checks were carried out after take-off. The indicator light is on an overhead panel, out of the normal instrument scan by the pilot. The condition requires no immediate action by the crew.
6. The automated V1 callout is generated by the enhanced ground proximity warning system (EGPWS) computer, and delivered 3 knots below the V1 speed entered into the flight management computer. Computed airspeed is also provided to the EGPWS computer by both air data inertial reference units (ADIRU). Normally, the EGPWS uses data from the left ADIRU, unless it is flagged as invalid. The EGPWS will then use the data from the right ADIRU.
7. This alert indicates that the captain's and FO's indications disagree by more than 5 knots for 5 continuous seconds.
8. Certain defined conditions cause the flight control computer to bias out of view the flight director command bar that might otherwise display an erroneous command. In the take-off mode of operation, and below a radio altimeter height of 400 feet, one of the flight control computers also compares its computed command with the corresponding command from the other flight control computer. A difference greater than a set value results in a bias-out-of-view (BOV) signal.
9. The DFDR parameter for the flight director pitch command is recorded from the aircraft's flight data acquisition unit (FDAU); it is not a direct recording of the flight director display on the attitude indicators. The flight director command signal from the flight control computer is sent within the aircraft's data distribution architecture to the FDAU and to 2 display electronic units (DEU) that provide data to the flight crew's attitude indicator displays.
10. As per the captain's altimeter, and as recorded by the DFDR and corrected to the current Toronto altimeter setting of 29.98 inches of mercury.
11. The Mode S transponder, using the right ADIRU as the source of altitude, indicated an altitude approximately 300 feet higher than that recorded by the DFDR throughout the remainder of the flight. This is attributed to the right ADIRU applying an incorrect position error correction due to incorrect total pressure from the right pitot system.
12. There is a switch in the cockpit that allows the crew to select either a boom microphone or a microphone embedded in the oxygen mask as the voice source for intercom and radio communications. It is known to be sensitive and can be difficult to position correctly.
13. The maintenance crew chief was not present during the entire debriefing and was unaware of the overspeed warning, the erroneous flight director command, and the radio problem.
14. Code of Federal Regulations, Title 14: Aeronautics and Space, Part 25 - Airworthiness Standards: Transport Category Airplanes, section 25.1309, Equipment, systems, and installations.
15. Ibid.
16. Boeing 737NG includes New Generation models -600, -700, -700C, -800, -900, -900ER, and -BBJ.
17. Boeing Fleet Team Digest 737NG-FTD-34-10006, ATA 3400-00, Recognition and Response to Erroneous Airspeed by Flight Crew and Airplane Systems, dated 13 September 2010.
18. Canadian Aviation Regulations, section 705.152 - Components of the Safety Management System.
19. Ibid.
20. Transport Canada, Staff Instruction (SI) No. SUR-001, Issue number 02, effective date 2009-02-06.
21. The material in the hazard analysis and risk analysis sections of the SMS Manual is similar to that of TSB's safety analysis process in the Integrated Safety Investigation Methodology (ISIM) and to the material contained in Transport Canada guidance documents.
22. Sunwing Airlines, Sunwing Airlines Flight Crew Operation Manual (Revision 10, version 2, 01 August 2009).
23. Ibid.
24. Transport Canada, Transport Canada Aeronautical Information Manual (October 18, 2012), SAR, section 4.1, Declaring an Emergency.
25. National Transportation Safety Board, Special Investigation Report (SIR-90/02), "Runway Overruns Following High Speed Rejected Takeoffs", PB90-917005 NTSB/SIR-90/02 (27 February 1990).
26. U.S. Department of Transportation, Federal Aviation Administration, Takeoff Safety Training Aid (revision 1, 02 April 1993).
27. Bureau d'Enquêtes et d'Analyses pour la sécurité de l'aviation civile, Interim Reports on the accident on 1st June 2009 to the Airbus A330-203 registered F-GZCP operated by Air France flight AF 447 Rio de Janeiro – Paris (France: July 2009, December 2009, July 2011 and the final report in July 2012).
28. TSB Report Number A09O0117.
29. Transportation Safety Board of Canada, Transportation Safety Board Regulations, Section 6.
30. International Civil Aviation Organization, Annex 13 to the Convention on International Civil Aviation, Aircraft Accident and Incident Investigation, Attachment C.
31. NAV CANADA, Aviation Occurrence Reporting Procedures, Version 3.2.
32. Transport Canada, TP4044 - CADORS Manual, Annex A.
33. Transportation Safety Board Laboratory, LP029/2011 - FDR Analysis.
34. Boeing, 737 Flight Crew Operations Manual (18 March 2011), MAN 1.2 and 1.3.
35. Sunwing Airlines, Flight Crew Operation Manual (revision 10, 01 August 2009 and revision 7, 10 September 2008).
36. Boeing, 737 Quick Reference Handbook (May 15, 2008).
37. Boeing, 737 Maintenance Manual (15 February, 2011).
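The footnote on the AIRSPEED DISAGREE alert gives its trigger condition: the captain's and FO's indications disagree by more than 5 knots for 5 continuous seconds. The persistence logic implied by that wording can be illustrated with a minimal sketch. This is not the aircraft's actual implementation. The function name, the sample format and the airspeed values are hypothetical, and only the two thresholds come from the footnote.

```python
from typing import Iterable, Optional, Tuple

# Thresholds taken from the footnote; everything else is an illustrative assumption.
DISAGREE_KNOTS = 5.0
PERSIST_SECONDS = 5.0


def first_disagree_alert(
    samples: Iterable[Tuple[float, float, float]]
) -> Optional[float]:
    """Return the time (s) at which the alert condition is first met, or None.

    Each sample is (time_s, captain_kias, fo_kias), in increasing time order.
    """
    disagree_since = None  # start time of the current run of disagreement
    for time_s, captain_kias, fo_kias in samples:
        if abs(captain_kias - fo_kias) > DISAGREE_KNOTS:
            if disagree_since is None:
                disagree_since = time_s
            if time_s - disagree_since >= PERSIST_SECONDS:
                return time_s
        else:
            # Any sample back within tolerance resets the persistence timer.
            disagree_since = None
    return None


# Made-up take-off roll (not DFDR data): the right-side indication lags
# the left by 2 knots more every second.
illustrative = [(float(t), 60.0 + 10.0 * t, 60.0 + 8.0 * t) for t in range(12)]
print(first_disagree_alert(illustrative))  # 8.0
```

In this made-up sequence the difference first exceeds 5 knots at 3 seconds, and the condition is met 5 seconds later, by which time the left-side airspeed has risen a further 50 knots. A persistence requirement of this kind filters out transient differences, but it also delays the alert during the take-off run, when airspeed is changing quickly.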