11/19/15

 

8900.1 CHG 131

VOLUME 3  GENERAL TECHNICAL ADMINISTRATION

CHAPTER 44  ASSESS CONTINUING ANALYSIS AND SURVEILLANCE SYSTEM FOR PARTS 121 AND 135

Section 1  Safety Assurance System: Evaluating an Air Carrier’s Continuing Analysis and Surveillance System

3-3891    REPORTING SYSTEM. Use Safety Assurance System (SAS) automation. This section is related to SAS Element 1.1.3 (AW), Continuing Analysis and Surveillance System (CASS).

3-3892    OBJECTIVE. This section provides guidance and information on the design, implementation, functions, and other considerations of an air carrier’s CASS.

3-3893    GENERAL.

A.    Regulatory Requirements. Title 14 of the Code of Federal Regulations (14 CFR) part 121, § 121.373 and part 135, § 135.431 require an air carrier operating under part 121 or 135 to establish and maintain a CASS. These sections also allow the Federal Aviation Administration (FAA) to require revisions to an operator’s maintenance program based on deficiencies or irregularities revealed by the CASS.

B.    Background.

1)    The FAA implemented the regulatory requirement for a CASS in 1964 in response to safety concerns and discoveries of systemic weaknesses in the maintenance programs of some air carriers. We identified these concerns during accident investigations and FAA surveillance activities accomplished during the 1950s. The FAA introduced the CASS as an element of a Continuous Airworthiness Program in a 1964 rulemaking that contained other elements such as a manual, an adequate maintenance organization, a maintenance recordkeeping system, Required Inspection Items (RII), and more.
2)    A CASS is required for air carriers operating under part 121 and for those to which § 135.411(a)(2) applies. CASS utilizes a systems-based approach, which permits an air carrier to identify and understand maintenance program deficiencies well enough to develop and implement permanent solutions for those deficiencies. CASS is a keystone of an air carrier’s ability to produce Airworthy aircraft on a consistent basis.

C.    Definitions.

1)    Audit. Scheduled or unscheduled formal reviews and verifications to evaluate compliance with policy, standards, and/or contractual requirements.
2)    Authority. The power to design or change fundamental policy or procedures without having to seek a higher level management approval. Authority is a permission; it is a right coupled with an autonomous power to accomplish certain acts or order others to act. Often, one person grants another the authority to act, as an employer does to an employee or a corporation to its officers, or the authority is a governmental empowerment to perform certain functions.
3)    Carried Out by the Certificate Holder or Other Person. The certificate holder must maintain operational control over maintenance that any person performs on its aircraft. Operational control includes independently determining the scope and type of maintenance that may be required, when to accomplish that maintenance, and whether the maintenance was done in accordance with its manual and program, regardless of who accomplished the maintenance.
4)    CASS. The elements of the system are always working. For example, continuing surveillance means someone is always looking and collecting information. Continuing analysis means that someone is always analyzing the information that is always being collected.
5)    Corrective Action. An action designed to eliminate or mitigate a deficiency that has been identified within the air carrier’s maintenance program.
6)    Deficiency. A condition which is insufficient or incomplete or where something required is lacking. In a CASS, it is something that is missing from the air carrier maintenance program that should be there, or it is something that is there but not producing the desired results. Alternatively, it could indicate that the maintenance program documentation is not being followed. For example, a program element that has failed and is not working or a program element that has faults and is not working as it should are deficiencies.
7)    Effective. Producing or capable of producing a desired result. The maintenance program is producing the desired results when the following objectives are realized:

    Airworthy aircraft that have been properly maintained for operations in air transportation;

    Competent personnel;

    Adequate facilities and equipment; and

    All maintenance, preventive maintenance, and alterations are always performed in accordance with the certificate holder’s maintenance program and manual.

8)    Establish and Maintain. “To establish” means that the air carrier develops a CASS that is appropriate for the type and scope of its operation. “To maintain” means that the air carrier keeps its CASS current and appropriate in response to changes in the type and scope of its operation.
9)    Maintenance Program. The programs outlined in §§ 121.367 and 135.425 and in other sections of part 121 subpart L and part 135 subpart J, and described in some detail in the current edition of Advisory Circular (AC) 120-16, Air Carrier Maintenance Programs.
10)    Maintenance. Inspection, overhaul, repair, preservation, and the replacement of parts, excluding preventive maintenance.
11)    Performance. The act of doing something successfully; the successful execution of an action. In the CASS, performance means that the maintenance program is being accomplished or executed as outlined in the air carrier manual.
12)    Person. An individual, firm, partnership, corporation, company, association, joint stock association, or governmental entity. It includes a trustee, receiver, assignee, or similar representative of any of them.
13)    Preventive Action. Action to eliminate or mitigate the cause or reduce the effects of potential nonconformity or another undesirable situation.
14)    Program. An organized list of procedures.
15)    Responsibility. The obligation to ensure that a task or function is successfully carried out. Responsibility includes accounting for actions related to the task or function. This is a key attribute of operational control.
16)    Risk. Risk is the degree of probability that hurt, injury, or loss will occur over a specific period of time or number of operational cycles. Risk has two elements: severity and likelihood. With regard to air carrier maintenance operations standards, the relationship between these two elements must be inverse.
a)    Severity. The type of harm that will be inflicted if a particular event occurs. For air carrier maintenance programs, severity should be expressed in qualitative terms as a consequence of failure: safety, operational, economic, or environmental.
b)    Likelihood. The estimated probability or frequency, in quantitative or qualitative terms, of an occurrence related to the hazard; an expression of the probability that a specific unsafe event will occur.
17)    Risk Mitigation. A risk control measure. It refers to the process of modifying the system in order to reduce the risk.
18)    Risk Management (RM). A formal process composed of identifying hazards, analyzing risk, assessing risk, and controlling risk. This process is embedded within the processes used to provide the product/service; it is not a separate process.
19)    Root Cause Analysis (RCA). The analysis of deficiencies to determine their underlying root cause.
20)    System. A functionally related group of elements. In the CASS, the elements are:

    Surveillance;

    Analysis;

    Corrective action; and

    Followup.

3-3894    OVERVIEW OF A CASS.

A.    Functions of the CASS. CASS functions as an air carrier maintenance program management tool that includes continuous and methodical monitoring and evaluation of an air carrier maintenance program. An air carrier’s CASS uses a continuous, system-safety-based, closed-loop cycle of surveillance, data collection and analysis, corrective action, and followup to continually evaluate the performance and effectiveness of the maintenance program. Through the CASS, the air carrier ensures that it is performing the right maintenance at the right time and that it produces the intended results. The CASS is one of the tools an air carrier uses to exercise operational control over maintenance activities conducted on its aircraft.

NOTE:  We cannot overstate that the CASS is a system and not a program. The primary responsibility for the CASS should rest with a specific individual who has the necessary authority, while coordination for the implementation of a CASS might be entrusted to a department.

1)    CASS Monitors Maintenance Program Performance. The program performance (program execution) part of the CASS ensures that everyone, including all of the air carrier’s maintenance providers, complies with the air carrier’s manual and program and with all applicable regulations.
a)    Generally, the program execution part of the CASS functions through a system of audits and investigations of operational events. The air carrier should consider each negative audit and each operational event as an indicator or symptom of a program or systemic failure. The air carrier should evaluate each one. However, depending on the results of the evaluation (risk analysis and risk assessment), not every symptom or indicator will require corrective action.
b)    The program execution part of CASS should include a continuous cycle of both scheduled and unscheduled (proactive and reactive) surveillance and investigations, data collection and analysis, corrective action, and followup surveillance.
2)    CASS Monitors Maintenance Program Effectiveness. The program effectiveness part of a CASS ensures that the maintenance program is producing the desired results. Primary indicators of the level of maintenance program effectiveness are the level of unscheduled maintenance and the rate of availability of the aircraft for use in air transportation.
a)    Generally, the program effectiveness part of the CASS functions through a system of data collection and analysis of operational data that results from operation of the aircraft. An operator should collect operational data and equipment failure data, which measures the output (results) of the maintenance program.
b)    Since one of the primary objectives of a maintenance program is to produce Airworthy aircraft for operations in air transportation, data sets such as the rate of aircraft availability, the rate of unscheduled landings, and the rate of schedule and dispatch reliability are useful for this purpose. An operator can collect this data in relation to a particular aircraft or a particular fleet.
c)    While the FAA does not mandate the specific data an operator should collect, the FAA does expect an operator to have an effective process designed to select appropriate, relevant, and useful types of collected data. This data selection process should also ensure that any data collected is useful for its intended purpose. Moreover, a periodic review of the type of collected data ensures that the collected data remains appropriate, relevant, and useful.
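As a rough illustration of the output measures named above, the following Python sketch computes three fleet-level indicators from one month of operational data. The record layout, the field names, and the formulas themselves are assumptions chosen for illustration only; the FAA does not mandate specific data sets or calculations, and each air carrier defines its own.

```python
from dataclasses import dataclass


@dataclass
class FleetMonth:
    """One month of operational data for one fleet (hypothetical layout)."""
    scheduled_departures: int
    technical_delays: int          # delay threshold is carrier-defined (assumption)
    technical_cancellations: int
    unscheduled_landings: int
    aircraft_days_possible: int    # aircraft in fleet x days in the month
    aircraft_days_available: int   # days aircraft were available for service


def dispatch_reliability(m: FleetMonth) -> float:
    """Percent of scheduled departures without a technical delay or cancellation."""
    if m.scheduled_departures <= 0:
        raise ValueError("scheduled_departures must be positive")
    interruptions = m.technical_delays + m.technical_cancellations
    return 100.0 * (m.scheduled_departures - interruptions) / m.scheduled_departures


def availability_rate(m: FleetMonth) -> float:
    """Percent of possible aircraft-days on which aircraft were available."""
    if m.aircraft_days_possible <= 0:
        raise ValueError("aircraft_days_possible must be positive")
    return 100.0 * m.aircraft_days_available / m.aircraft_days_possible


def unscheduled_landing_rate(m: FleetMonth) -> float:
    """Unscheduled landings per 1,000 scheduled departures."""
    if m.scheduled_departures <= 0:
        raise ValueError("scheduled_departures must be positive")
    return 1000.0 * m.unscheduled_landings / m.scheduled_departures


# Example: a 10-aircraft fleet over a 31-day month (made-up numbers).
october = FleetMonth(
    scheduled_departures=2000,
    technical_delays=18,
    technical_cancellations=2,
    unscheduled_landings=1,
    aircraft_days_possible=310,
    aircraft_days_available=295,
)
indicators = {
    "dispatch_reliability_pct": dispatch_reliability(october),
    "availability_pct": availability_rate(october),
    "unscheduled_landings_per_1000": unscheduled_landing_rate(october),
}
```

Tracking such indicators per aircraft as well as per fleet, and reviewing periodically whether they are still the right indicators, mirrors the data selection process described in the preceding paragraphs.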

B.    How CASS Does It. The CASS enables an air carrier to detect and correct discrepancies in all elements of its maintenance program by proactively looking for indicators and symptoms of deficiencies and reactively looking at the results of deficiencies. CASS monitors maintenance program performance and effectiveness through a systems approach using a closed-loop system of four major activities:

    Surveillance;

    Data analysis;

    Corrective action; and

    Followup.

C.    CASS Surveillance and Analysis. The regulations require that a CASS accomplish surveillance and analysis of the air carrier maintenance program from two perspectives: performance and effectiveness. An air carrier conducts the first two activities in the CASS (surveillance and analysis) in two different ways. The primary basis for one activity is audits, while the primary basis for the other is operational data collection and analysis. The results of these two types of surveillance and analysis feed into the third and fourth basic CASS activities (corrective action and followup). Table 3-124, The Major Continuing Analysis and Surveillance System Activities, summarizes the flow of the four basic activities of a CASS.

Table 3-124.  The Major Continuing Analysis and Surveillance System Activities

VERIFY PERFORMANCE OF THE MAINTENANCE PROGRAM

1. Surveillance: Audit process.

    Create a risk-based audit plan.

    Perform work-in-progress audits.

    Perform transaction audits.

    Perform system audits.

    Identify hazards.

2. Analysis: Identify hazards, accomplish risk analysis and assessment.

VERIFY EFFECTIVENESS OF THE MAINTENANCE PROGRAM

1. Surveillance: Data collection process.

    Select data sets.

    Collect operational data.

    Collect failure data.

    Identify trends, anomalies, and potential hazards.

2. Analysis: Identify hazards, investigate adverse indicators, and accomplish risk analysis and assessment.

BOTH PERFORMANCE AND EFFECTIVENESS

3. Corrective Action: Accomplish an RCA, and develop, implement, and monitor a corrective action plan (CAP), as appropriate.

4. Followup: Verify that the corrective action was effective and initiate risk-based followup surveillance planning, as appropriate.

1)    Surveillance. The air carrier conducts surveillance so it can gather information and collect data for use in the evaluation of all elements of its program (including its maintenance providers) from two different perspectives: performance and effectiveness.
a)    Surveillance to verify performance involves the use of audits, specifically work-in-progress audits used to make sure the manual and program are being followed.
b)    Surveillance to verify effectiveness involves the collection of operational data and aircraft systems failure data so that the air carrier can make conclusions about the degree of effectiveness of the maintenance program.
2)    Analysis of Data. Data analysis is the identification of system deficiencies (hazards) in an air carrier’s maintenance program through analysis of the various kinds of data that the air carrier has chosen to collect. Data analysis is also used to verify an acceptable level of program performance or effectiveness.
a)    The performance (program execution) analysis function of the CASS is carried out through the analysis of data collected during the accomplishment of audits and investigations. These audits and investigations examine the actual accomplishment of the activities and tasks of a maintenance program element relative to the standard (i.e., the air carrier manual and the maintenance program). The accomplishment of audits and analysis of audit data serve to measure program execution.
b)    The effectiveness (intended results produced) analysis function of the CASS is carried out through the analysis of collected operational data. Collection and analysis of operational data allows the air carrier to measure the output of the maintenance program relative to its objectives.
3)    Corrective Action. CASS identifies deficiencies through analysis of the audit and operational data that it collects. However, based on the risk assessment performed during risk analysis, not all deficiencies will require corrective action. The risk might be at an acceptable level. For example, the air carrier may consider a number of mechanical delays or cancellations acceptable, as long as safety is not compromised.
a)    When a CASS determines a risk to be of an unacceptable level, it will employ risk controls (corrective action) to deal with an identified deficiency and the cause(s) of that deficiency.
b)    When a CASS requires the development of a CAP, it will address the causal factor(s) and provide a solution to prevent recurrence. Within a CASS, an RCA is used to identify the central causes of an event and facilitate effective corrective actions. A CASS will implement and monitor the plan through completion.
4)    Followup. Followup is the very important function that ensures the corrective action has addressed the deficiency. The followup ensures that the corrective action accomplishes what the air carrier intended it to do and connects the closed loop back to surveillance. Based on the assessment of risk, the air carrier can perform additional surveillance and/or modify data collection processes.
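The four activities above can be sketched as a small state machine that a single finding moves through. This is a minimal illustration, not a prescribed design: the `Stage` and `Finding` names, the `risk_acceptable` and `fix_effective` flags, and the rule that an ineffective fix returns the finding to surveillance are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    SURVEILLANCE = "surveillance"
    ANALYSIS = "analysis"
    CORRECTIVE_ACTION = "corrective action"
    FOLLOWUP = "followup"
    CLOSED = "closed"


@dataclass
class Finding:
    """A single audit or operational-data finding (hypothetical record)."""
    description: str
    risk_acceptable: bool         # outcome of the carrier's risk assessment
    fix_effective: bool = True    # outcome of followup verification
    stage: Stage = Stage.SURVEILLANCE
    history: list = field(default_factory=list)


def advance(f: Finding) -> Stage:
    """Move a finding one step around the closed loop and return its new stage."""
    if f.stage is Stage.CLOSED:
        return f.stage            # nothing further to do
    f.history.append(f.stage)
    if f.stage is Stage.SURVEILLANCE:
        f.stage = Stage.ANALYSIS
    elif f.stage is Stage.ANALYSIS:
        # Not every deficiency requires corrective action; acceptable risk closes it.
        f.stage = Stage.CLOSED if f.risk_acceptable else Stage.CORRECTIVE_ACTION
    elif f.stage is Stage.CORRECTIVE_ACTION:
        f.stage = Stage.FOLLOWUP
    elif f.stage is Stage.FOLLOWUP:
        # An ineffective fix re-enters the loop at surveillance (assumption).
        f.stage = Stage.CLOSED if f.fix_effective else Stage.SURVEILLANCE
    return f.stage


# Example: a finding whose risk is unacceptable and whose fix works.
finding = Finding("task card step signed off without recorded value", risk_acceptable=False)
while advance(finding) is not Stage.CLOSED:
    pass
```

In practice a closed finding would still be covered by ongoing surveillance; the `CLOSED` stage here only marks the end of one pass through the loop.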

D.    RM in CASS.

1)    In concert with the attributes of a good organization, personnel and resources for a CASS should be prioritized as part of the overall risk management process (RMP). RM facilitates the balancing act between assessed risks and practical risk mitigation.
2)    RM serves to focus safety efforts on those hazards posing the greatest risks. Essentially, any methodology used to prioritize surveillance personnel and resources (as well as to formulate corrective action decisions later in the process) involves principles of RM.
3)    The following elements compose a formal RMP:

    Identifying hazards;

    Analyzing risk;

    Assessing risk; and

    Controlling risk.

NOTE:  The flowchart in Figure 3-131B, Overview of the Risk Management Process, provides an overview of the RMP. The elements of an RMP encompass the four major CASS activities (Table 3-124). You can find a detailed description of the RMP in paragraph 3-3898.

Figure 3-131B.  Overview of the Risk Management Process
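The four RMP elements can be illustrated with a short Python sketch. Everything specific in it is hypothetical: the likelihood scale, the per-severity ceilings in `MAX_ACCEPTABLE`, and the function names are assumptions chosen only to show the inverse relationship between severity and tolerated likelihood. An air carrier's actual criteria come from its own RMP.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Qualitative likelihood scale (assumed; carriers define their own)."""
    IMPROBABLE = 1
    REMOTE = 2
    OCCASIONAL = 3
    FREQUENT = 4


# Hypothetical ceilings: the more severe the consequence category,
# the lower the likelihood that is tolerated (inverse relationship).
MAX_ACCEPTABLE = {
    "safety": Likelihood.IMPROBABLE,
    "operational": Likelihood.REMOTE,
    "environmental": Likelihood.REMOTE,
    "economic": Likelihood.OCCASIONAL,
}


def assess(severity: str, likelihood: Likelihood) -> bool:
    """Assess risk: True when the analyzed risk is at an acceptable level."""
    if severity not in MAX_ACCEPTABLE:
        raise ValueError(f"unknown severity category: {severity!r}")
    return likelihood <= MAX_ACCEPTABLE[severity]


def hazards_needing_control(hazards):
    """Walk identified hazards through analysis and assessment.

    `hazards` is an iterable of (name, severity, likelihood) tuples, i.e. the
    output of hazard identification plus risk analysis. Returns the names of
    hazards whose risk is unacceptable and therefore needs a risk control.
    """
    return [name for name, sev, lik in hazards if not assess(sev, lik)]


# Example with made-up hazards.
identified = [
    ("torque wrench out of calibration", "safety", Likelihood.REMOTE),
    ("late parts delivery", "economic", Likelihood.OCCASIONAL),
    ("recurring APU start fault", "operational", Likelihood.FREQUENT),
]
to_control = hazards_needing_control(identified)
```

Hazards that come back from `hazards_needing_control` would feed the corrective action and followup activities; the rest are accepted but remain under surveillance.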


E.    CASS in an Air Carrier’s Operation. An air carrier should tailor its CASS to its individual operation. Therefore, to a large degree, what the CASS looks like will depend on the design of the maintenance organization and the size, complexity, and level of flight operations of that air carrier.

1)    The basic CASS functions are always the same, but the personnel who carry out each function and the manner in which the functions are carried out will be different from one air carrier to another. For example, an air carrier with a high level of daily aircraft utilization and a very large fleet of many different kinds of aircraft may have a separate department dedicated to performing CASS activities. On the other hand, an air carrier with a fleet of 25 aircraft, operating seasonally or weekly, may find it more efficient to use its quality assurance (QA) department to perform CASS activities. An on-demand operator with few employees and one or two aircraft having an average annual utilization of less than 1,000 hours may contract most of its CASS activities.
2)    Regardless of the air carrier’s size and level of flight operations, a well-structured CASS helps an air carrier exercise operational control over maintenance activities. This involves taking a systems approach to enhancing safety and eliminating deficiencies as well as systematically determining the level of performance and effectiveness of its maintenance program. This is a key to achieving operations with the highest possible degree of safety as well as a very high degree of efficiency.

F.    What CASS Examines. A CASS monitors all 10 elements of the air carrier’s maintenance program. A CASS accounts for the consequences of various internal and external influences on the maintenance program. The following are examples of some, but not all, of the items within each element that a CASS looks at. Note that the CASS examines all of these items through its surveillance of maintenance program performance, which is accomplished through audits. However, in addition to real-time events such as accidents/incidents, CASS will address effectiveness discrepancies identified through the collection and analysis of operational data (i.e., RCA).

1)    Airworthiness Responsibility.
a)    Air carriers are primarily responsible for the performance of maintenance, including work done by maintenance providers on their aircraft. All maintenance, including work done by outside persons, must be done in accordance with the air carrier’s maintenance program and maintenance manual.
b)    An air carrier certificate makes the certificate holder a maintenance entity. Each person who accomplishes maintenance on a certificate holder’s aircraft accomplishes it on behalf of the certificate holder as an agent for the certificate holder. Consistent with the privileges and limitations of their air carrier certificates, air carriers, through their maintenance organizations, are responsible for exercising operational control over maintenance activities that anyone accomplishes on their aircraft. Air carriers are required to determine when maintenance is required and what maintenance is required, accomplish the maintenance, determine whether the maintenance was done satisfactorily, and approve their aircraft for return to service. Consistent with the regulations, an air carrier certificate holder may not delegate this responsibility to persons used by the certificate holder for any maintenance, preventive maintenance, or alterations.
2)    Maintenance Manuals. CASS ensures that:
a)    The content of all manuals, including maintenance manuals and technical content, is the responsibility of the air carrier. The basis of the manuals may be the Original Equipment Manufacturer (OEM) manuals or other information, but it is a requirement for the air carrier to use its own manual, not the OEM manuals.
b)    Manuals, publications, and forms are useable, current, correct, and readily available to all personnel required to use them.
c)    Each person required to comply with the air carrier’s manual has access to it during the performance of normal duties.
3)    Maintenance Organization.
a)    Consistent with the responsibility described above, air carriers must have a maintenance organization that is able to effectively exercise and maintain operational control over all persons performing, supervising, managing, and amending the maintenance program. The maintenance organization must be able to manage and guide its maintenance personnel and provide the direction necessary to achieve overall maintenance program objectives.
b)    The individual with overall maintenance program authority and responsibility is the Director of Maintenance (DOM), who functions as the accountable manager for the maintenance program. The organization must clearly identify this individual within the organization and the individual must be qualified in accordance with 14 CFR part 119, §§ 119.65 and 119.67(c) or §§ 119.69 and 119.71(e), as appropriate. While retaining overall authority and responsibility, the accountable manager may delegate direct responsibility for elements of the program as appropriate for the size and structure of the maintenance organization.
c)    The air carrier manual must contain a chart or a description of the maintenance organization showing clear authority and responsibility, including delegated responsibility, for the overall maintenance program and all of its elements and functions. The regulations require the air carrier to include a description in its maintenance manual of the duties and responsibilities for each position in the organization so that there is not a fragmented organizational system with high risk of confusion over who is responsible for a given task.
d)    In order to be effective, an adequate maintenance organization must be able to demonstrate the following four organizational duties:

1.    The duty to define the environment within which individuals conduct their tasks.

2.    The duty to define the policies and procedures that individuals must follow and respect.

3.    The duty to allocate the resources that individuals need in order to achieve safety and production goals.

4.    The duty to investigate system failures and take all needed remedial action to avoid a repetition.

e)    A maintenance organization will not be successful if it permits the following failures to occur.

1.    Failure to understand the effect of people on safety and reliability of aircraft maintenance operations.

2.    Failure to organize its employees’ work.

3.    Failure to monitor its employees’ work effectively.

4.    Failure to implement corrective actions.

f)    The performance of the RII function(s) must be organizationally separated from the performance of the other maintenance (including inspection), preventive maintenance, and alteration functions. This organizational separation must be below the level of the individual who has primary responsibility for the RII function, other maintenance, preventive maintenance, and alterations functions. In simple terms, this means that the part of the maintenance organization that accomplishes the maintenance (including inspection), preventive maintenance, and alterations function cannot be the same part of the maintenance organization that accomplishes the RII function.
4)    Maintenance Schedule. The maintenance schedule sets out the appropriate item, task, and interval of the air carrier’s scheduled maintenance effort. The FAA expects the air carrier’s maintenance schedule to be task based and appropriately modified in accordance with the CASS data collection and analysis findings. The air carrier accomplishes the initial selection and the continuous validation of each scheduled maintenance task and its associated interval according to well-defined criteria throughout the service life of the item, system, or structure. The maintenance schedule is proactive and designed to permit the item, system, or structure to do what it is designed to do. Notwithstanding design issues, the level of unscheduled maintenance is a primary indicator of the level of effectiveness of the maintenance schedule.
5)    RIIs.
a)    The air carrier has specific procedures, standards, and limits necessary for the acceptance or rejection of each RII and for periodic inspection and calibration of precision tools, measuring devices, and test equipment. Note that the OEM’s manuals and procedures do not contain RII procedures, standards, and limits; the air carrier must develop and document these.
b)    Personnel authorized to accomplish RII inspections receive proper training and qualification for each RII task that they receive the authorization to perform.
c)    Designated RII inspectors who perform an item of work do not perform the required inspection on that item.
d)    The maintenance organization separates the performance of the required inspection functions from the other maintenance, preventive maintenance, and alteration functions.
e)    The manual contains procedures to ensure that only supervisory personnel of an inspection unit, or the person who has overall responsibility for the RII function as well as the other maintenance, preventive maintenance, and alteration functions, may countermand the decision of any RII inspector regarding an RII.
6)    Contract Maintenance. Vendors and suppliers have the qualifications and provide services and products according to the air carrier’s maintenance program and manual. There should be no difference between the way work is done by air carrier personnel or by the air carrier’s maintenance providers.
7)    Personnel Training.
a)    The air carrier must have a means to determine that all maintenance personnel, including maintenance provider personnel, are competent to accomplish their duties.
b)    The air carrier has a training program for personnel (including inspection personnel and maintenance provider personnel) who determine the adequacy of accomplished maintenance.
c)    The program ensures that personnel are competent to perform their duties.
8)    Accomplishment and Approval of Maintenance.
a)    Maintenance facilities and equipment, as well as the air carrier’s maintenance providers’ facilities and equipment, are adequate to perform the maintenance. Other than scope and location, there should be no difference in the standards for facilities and equipment between the air carrier and its maintenance providers.
b)    Maintenance providers properly store, dispense, identify, and handle parts and components.
c)    Maintenance providers properly calibrate tools and equipment.
d)    Maintenance providers identify the requirements for specialized tools or training and provide training.
e)    Maintenance providers perform maintenance and alterations according to methods, standards, and techniques specified in the air carrier’s manuals.
f)    Maintenance provider personnel properly document work interruptions and deferred maintenance in shift turnover records, and accomplish them according to applicable procedures.
g)    Maintenance providers properly classify major repairs and major alterations (consistent with the 14 CFR part 1, § 1.1 meaning of major alteration or repair) and accomplish them in accordance with FAA‑approved technical data.

NOTE:  During 1953, the Civil Aeronautics Administration (CAA) published a list of repairs to specific parts as well as specific types of repairs that were considered major repairs in Civil Aeronautics Manual (CAM) 18. This major repair list was later adopted, unchanged, as part of 14 CFR part 43, appendix A.

h)    If the air carrier relies exclusively on this standardized list of major repairs to make the major/minor classification, it will result in the classification of some minor repairs as major and the classification of some major repairs as minor simply because the list has not been updated to include evolving airplane design and construction techniques such as composite structures, damage-tolerant design, and the high-speed pressurized jet transport that did not exist in 1953.
i)    Appropriately certified mechanics or repairmen, who are authorized by the air carrier, execute log entries and Airworthiness Release Forms.
j)    Maintenance providers complete log entries and Airworthiness Release Forms according to the air carrier’s written policies and procedures.
9)    Maintenance Recordkeeping System.
a)    Maintenance records and current status records are generated and retained in accordance with the air carrier’s manual procedures.
b)    Maintenance records and current status records are complete and correct.
c)    Airworthiness Directives (AD) are appropriately evaluated, accomplished, and tracked.
d)    Life-limited parts are identified and their current time in service status is tracked.
10)    CASS.
a)    CASS has four major activities that ensure, with a system-oriented, structured approach, that all elements of the air carrier maintenance program are properly executed and are consistently effective by design rather than by chance.
b)    Senior management reviews CASS issues on a regularly scheduled basis. Meetings of CASS or maintenance management committees or boards are also held on a regular basis to discuss findings, analysis, and the progress of corrective actions. These meetings may address events, as well as statistical data and trends.
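Several of the items above are rule-like enough to check against records during a transaction audit. As one example, the RII conditions in item 5) (the inspector is authorized, and did not perform the work being inspected) could be sketched as follows. The `TaskRecord` layout, the employee IDs, and the finding texts are hypothetical and not drawn from any air carrier's system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskRecord:
    """One completed maintenance task (hypothetical record layout)."""
    task_id: str
    is_rii: bool
    performed_by: frozenset          # employee IDs of those who did the work
    inspected_by: Optional[str]      # employee ID of the RII inspector, if any


def rii_findings(records, authorized_inspectors):
    """Return (task_id, finding) pairs for RII tasks that break a basic rule.

    `authorized_inspectors` is the set of employee IDs the carrier has
    authorized to perform required inspections (assumed to be task-agnostic
    here; real authorizations are granted per RII task).
    """
    findings = []
    for r in records:
        if not r.is_rii:
            continue
        if r.inspected_by is None:
            findings.append((r.task_id, "no required inspection recorded"))
        elif r.inspected_by in r.performed_by:
            findings.append((r.task_id, "inspector also performed the work"))
        elif r.inspected_by not in authorized_inspectors:
            findings.append((r.task_id, "inspector not RII-authorized"))
    return findings


# Example with made-up records.
records = [
    TaskRecord("T1", True, frozenset({"E100"}), "E100"),
    TaskRecord("T2", True, frozenset({"E100"}), "E200"),
    TaskRecord("T3", False, frozenset({"E100"}), None),
    TaskRecord("T4", True, frozenset({"E300"}), None),
    TaskRecord("T5", True, frozenset({"E100"}), "E999"),
]
audit_findings = rii_findings(records, authorized_inspectors={"E100", "E200"})
```

Each returned pair would be treated like any other audit finding: evaluated for risk and, where warranted, carried through corrective action and followup.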

3-3895    PREREQUISITES AND COORDINATION REQUIREMENTS.

A.    Prerequisites. Knowledge of the requirements of parts 119, 121, and/or 135.

B.    Coordination. This task may require coordination between the principal maintenance inspector (PMI) and the principal avionics inspector (PAI).

3-3896    REFERENCES, FORMS, AND JOB AIDS.

A.    References (current editions):

    AC 120-16, Air Carrier Maintenance Programs.

    AC 120-79, Developing and Implementing an Air Carrier Continuing Analysis and Surveillance System.

    AC 120-92, Safety Management Systems for Aviation Service Providers.

    FAA Order 8000.369, Safety Management System.

    FAA Order 8040.4, Safety Risk Management Policy.

B.    Forms. None.

C.    Job Aids. None.

3-3897    VERIFY THE CASS ORGANIZATIONAL STRUCTURE.

A.    Identifying CASS Organization Positions. Identify the positions within the company that have authority and responsibility for the CASS. The definitions below have meaning within the context of an air carrier’s organization. Consistent with existing regulations, there should be a chart or description of the CASS organization in the air carrier’s manual.

1)    Authority is a permission; it is the power to create or modify fundamental policy or procedures without higher level review or approval. Authority also means the power to accomplish a function, as well as the power to assign responsibility for carrying out the various functions of the maintenance program. The individual with authority for the CASS may design or change the CASS without having to seek approval from a higher level of management. CASS procedures should include a process to modify or revise the CASS.
2)    Responsibility is an obligation that comes with accountability to ensure the successful completion of tasks and functions in accordance with applicable policies, procedures, and standards. This work may be accomplished directly by the individual with the responsibility, or the responsibility for the work may be delegated. The individual with responsibility for the CASS has the obligation to carry out the functions of the CASS, including overseeing and managing any personnel who are assigned CASS functions and duties. Note that for smaller organizations where personnel share duties and may only carry out CASS functions part time, this oversight and management responsibility relates only to those part-time tasks.

B.    Authority and Responsibility.

1)    An individual or position within the maintenance organization should have authority for the CASS, and an individual or position within the maintenance organization should have overall responsibility for managing and implementing the CASS. An individual may have both responsibility and authority for the CASS.
2)    That individual might also have responsibility for other functions as well as the CASS. It is common for the individual with responsibility for CASS functions to delegate some or much of this work to others within the organization, depending on the size and staffing of the operator.
3)    What the FAA expects is clear responsibility for the overall CASS functions so that there is not a fragmented system with a high risk of confusion over who is responsible for executing a given task or function.
4)    The potential exists for a conflict of interest between personnel managing daily operations within the carrier and those who serve in an oversight role. Personnel with CASS responsibilities and duties should be as independent as possible from the day-to-day operations of the maintenance program. Theoretically, outside personnel contracted to perform such work for the air carrier conduct the most independent, objective audits.
5)    Air carrier personnel who are conducting audits should work in separate departments from the departments performing the actual maintenance activities that are being audited. However, this is not necessarily feasible for small operators. At small operators, personnel performing CASS functions, particularly audits, may consist of one or more of the following:
a)    Borrowed personnel from other shops or departments. The operator’s procedures should include ways to avoid having these individuals assigned to audit areas where they normally work.
b)    The company owner or chief executive officer (CEO), particularly if there are no other employees and the CASS audits are focused on outside vendors and maintenance providers because all or most of the actual inspection and maintenance work is accomplished through contracts.
c)    Outside resources contracted to perform audits and analysis for the company.
d)    Others deemed qualified by the operator to provide the operator with an independent objective audit, operational data collection, and analysis services that fulfill the requirements of a CASS.

3-3898    VERIFY THE CASS FUNCTIONS CONCERNING RM.

A.    RMP. In an effective CASS, you should be able to identify the principles of the systematic RMP that:

    Establish a plan, including the scope of the process and priorities (e.g., detect and prevent noncompliance);

    Specify the areas of concern for surveillance and analysis (personnel, maintenance and inspection programs and organizations, operations, aircraft, facilities, systems);

    Identify hazards or potential threats to the operation;

    Determine how likely such hazards are to be realized and actually cause harm;

    Determine the severity of the consequences if the hazard is realized;

    Express a combination of the likelihood and severity of harm as risk; and

    Evaluate the appropriate response to the identified risk.

B.    Hazard Identification. The CASS should detect and correct deficiencies in all 10 elements of the air carrier’s maintenance program, both proactively, by looking for indicators and symptoms of deficiencies, and reactively, by examining the results of deficiencies.

1)    The CASS should identify the deficiencies (hazards) within the air carrier’s maintenance program during the analysis of data. The proactive approach for identifying deficiencies involves setting surveillance priorities based on risk assessments aimed at maintaining compliance and safety in inspection and maintenance.
2)    A CASS should take into account four principal potential sources of hazards:

    Personnel (hiring, capabilities, interaction);

    Equipment (design, maintenance, logistics, technology);

    Workplace (environment, sanitation); and

    Organization (standards, procedures, controls).

3)    A CASS may also recognize hazards through actual events such as an incident or accident (a more reactive approach). These events provide clear evidence of problems in a system and therefore provide an opportunity to investigate the event and identify the hazards putting the system at risk. In practice, both proactive processes and reactive measures are valuable means of identifying hazards, and a CASS should evaluate the hazards identified by either means.
4)    An air carrier CASS should include clear procedures for determining:

    Who will be responsible for performing hazard identification;

    What personnel training or qualifications will be required to participate in hazard identification;

    When to perform hazard identification;

    How to accomplish the determination of a hazard; and

    How to document the hazard.
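The determinations listed above can be captured in a simple hazard record. The following Python sketch is illustrative only: the class, field names, and example values are hypothetical and are not taken from FAA guidance or from any operator's system. It records who identified a hazard, when, how, and which of the four principal hazard sources applies.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class HazardSource(Enum):
    """The four principal potential sources of hazards named in this section."""
    PERSONNEL = "personnel"
    EQUIPMENT = "equipment"
    WORKPLACE = "workplace"
    ORGANIZATION = "organization"


@dataclass(frozen=True)
class HazardRecord:
    """One documented hazard; every field name here is hypothetical."""
    hazard_id: str
    description: str
    source: HazardSource
    identified_by: str     # who performed the hazard identification
    identified_on: date    # when it was performed
    method: str            # how the hazard was determined
    proactive: bool        # True if found before an event, False if after one


def is_complete(record: HazardRecord) -> bool:
    """Check that the free-text fields needed for later analysis are filled in."""
    fields = (record.hazard_id, record.description,
              record.identified_by, record.method)
    return all(text.strip() for text in fields)


example = HazardRecord(
    hazard_id="HZ-0001",
    description="Torque wrench calibration lapsed at a line station",
    source=HazardSource.EQUIPMENT,
    identified_by="CASS auditor",
    identified_on=date(2015, 11, 19),
    method="work-in-progress audit",
    proactive=True,
)
```

A record of this kind gives the later risk analysis a consistent starting point, whatever form (paper or electronic) the operator actually uses.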

C.    Risk Analysis and Assessment.

1)    Analysis, Likelihood, and Severity Assessment. After confirming the presence of a hazard, and in keeping with the concept of managing risk to an acceptable level, a risk analysis is required to assess the hazard’s potential for harm or damage. Analyze, assess, and rank all identified hazards in the order of their risk potential. Risk analysis and risk assessment use a conventional breakdown of risk by its two components: likelihood of occurrence of an injurious mishap and severity of the mishap related to an identified hazard, should it occur. A common tool for risk decision making and acceptance is a risk matrix similar to those in the current edition of U.S. Military Standard MIL-STD-882, Standard Practice for System Safety, and the International Civil Aviation Organization (ICAO) Safety Management Manual (SMM). Figure 3-131C, Sample Risk Matrix, shows an example of one such matrix. Operators should develop a matrix that best represents their operational environment. The matrix should be consistent throughout the operator’s organizational structure.
2)    Risk Analysis.
a)    The definitions and final construction of the matrix are left to the operator’s organization to design. Each level of severity and likelihood should be defined in terms that are realistic for the operational environment. This ensures that each organization’s decision tools are relevant to its operations and operational environment, recognizing the extensive diversity in this area.
b)    An example of severity and likelihood definitions is shown in Table 3-124A, Sample Severity and Likelihood Criteria. Each operator’s specific definitions for severity and likelihood may be qualitative, but quantitative measures are preferable, wherever possible.

Table 3-124A.  Sample Severity and Likelihood Criteria

| Severity Level | Severity Definition | Severity Value | Likelihood Level | Likelihood Definition | Likelihood Value |
|---|---|---|---|---|---|
| Catastrophic | Equipment destroyed, multiple deaths. | 5 | Frequent | Likely to occur many times | 5 |
| Hazardous | Large reduction in safety margins, physical distress, or a workload such that operators cannot be relied on to perform their tasks accurately or completely. Serious injury or death to a number of people. Major equipment damage. | 4 | Occasional | Likely to occur sometimes | 4 |
| Major | Significant reductions in safety margins, reduction in the ability of the operator to cope with adverse operation conditions as a result of an increase in workload, or as a result of conditions impairing their efficiency. Serious incident. Injury to persons. | 3 | Remote | Unlikely, but possible to occur | 3 |
| Minor | Nuisance. Operating limitations. Use of emergency procedures. Minor incident. | 2 | Improbable | Very unlikely to occur | 2 |
| Negligible | Little consequence. | 1 | Extremely Improbable | Almost inconceivable that the event will occur | 1 |

3)    Risk Assessment. We expect operators, when developing their risk assessment criteria, to develop risk acceptance procedures, including acceptance criteria and designation of authority and responsibility for RM decisionmaking. You can evaluate the acceptability of risk using a risk matrix such as the one illustrated in Figure 3‑131C. The example matrix shows three areas of acceptability. Risk matrices may be color-coded: unacceptable (red), acceptable (green), and acceptable with mitigation (yellow).
a)    Unacceptable (red). Where combinations of severity and likelihood cause risk to fall into the red area, the risk would be assessed as unacceptable and further work would be required to design an intervention to eliminate that associated hazard or to control the factors that lead to higher risk likelihood or severity.
b)    Acceptable (green). Where the assessed risk falls into the green area, it may be accepted without further action. The objective of RM should always be to provide an appropriate response to an identified risk.
c)    Acceptable with mitigation (yellow). Where the risk assessment falls into the yellow area, the risk may be accepted under defined conditions of mitigation. An example of this situation would be an assessment of the impact of a nonoperational aircraft component for inclusion on a minimum equipment list (MEL). Defining an operational (“O”) or maintenance (“M”) procedure in the MEL would constitute a mitigating action that could make an otherwise unacceptable risk acceptable, as long as the defined procedure was implemented.

Figure 3-131C.  Sample Risk Matrix


4)    Assessing the Risk of a Hazard. CASS procedures for analyzing and assessing the risk of a hazard should include:

    Who will be responsible for performing risk analysis and assessment;

    Whether the responsible person is trained to perform risk analysis and assessment;

    Specific set of criteria for how the determination will be made (i.e., detailed written procedures and a matrix);

    Who will have the authority to approve the assessment; and

    What levels of review, if any, will be performed.

5)    Risk Mitigation. The end result of the risk analysis and assessment is a decision: the risk is acceptable, acceptable with mitigation, or unacceptable. A risk control measure that is developed and implemented should be directly related to the final decision of the risk assessment.
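The path from severity and likelihood to one of the three decisions can be sketched in a few lines of Python. The severity and likelihood values below come from Table 3-124A. Everything else is an assumption made for illustration: the risk index is taken as the product of the two values, and the red and yellow thresholds (15 and 6) are placeholders, because the cell-by-cell layout of Figure 3-131C is not reproduced in this text. Each operator would substitute its own matrix.

```python
# Values copied from Table 3-124A, Sample Severity and Likelihood Criteria.
SEVERITY = {"catastrophic": 5, "hazardous": 4, "major": 3,
            "minor": 2, "negligible": 1}
LIKELIHOOD = {"frequent": 5, "occasional": 4, "remote": 3,
              "improbable": 2, "extremely improbable": 1}


def risk_index(severity: str, likelihood: str) -> int:
    """Combine the two values into one index (assumed here: their product)."""
    return SEVERITY[severity.lower()] * LIKELIHOOD[likelihood.lower()]


def classify(severity: str, likelihood: str,
             red_from: int = 15, yellow_from: int = 6) -> str:
    """Map a hazard to a decision; both thresholds are hypothetical."""
    index = risk_index(severity, likelihood)
    if index >= red_from:
        return "unacceptable"
    if index >= yellow_from:
        return "acceptable with mitigation"
    return "acceptable"


def rank(hazards: dict[str, tuple[str, str]]) -> list[str]:
    """Return hazard names ordered from highest to lowest risk index."""
    return sorted(hazards, key=lambda name: risk_index(*hazards[name]),
                  reverse=True)
```

Under these assumed thresholds, a hazard rated Major and Remote has an index of 9 and falls in the "acceptable with mitigation" band, so a risk control measure would be expected before the risk is accepted.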

3-3899    VERIFY THE CASS FUNCTIONS CONCERNING THE PERFORMANCE OF THE MAINTENANCE PROGRAM.

A.    Surveillance of the Performance of Maintenance Programs.

1)    Audits are the main tool for determining whether the operator and its maintenance providers are properly performing the maintenance program. An audit is a formal examination of the activities of an air carrier’s or maintenance provider’s departments or areas as compared to a standard, which is the air carrier’s program as written in its manual. The air carrier’s audits should be designed to measure the air carrier’s and maintenance provider’s compliance with their maintenance program requirements. The maintenance program itself must ensure that the maintenance provider accomplishes all maintenance activities in accordance with the processes and procedures of the maintenance program (§§ 121.367 and 135.425).
2)    Air carriers usually use four types of audits. Generally, the differences are who accomplishes the audit and whose activities the audit examines.
a)    Internal audits are performed by air carrier audit personnel on air carrier activities.
b)    External audits are performed by air carrier personnel on activities of outside entities or maintenance providers.
c)    Self‑audits are performed “in house” by a specific individual or position within a department, shop, or maintenance provider.
d)    Third-party audits are performed by third parties (the FAA, Coordinating Agencies for Supplier’s Evaluation (C.A.S.E.), Department of Defense (DOD), etc.) on the air carrier or maintenance provider.
3)    There are at least three audit methods used by air carriers. Generally, the differences are the audit objective and what the audit looks at.
a)    The work-in-progress method is the primary audit method that we expect the air carrier to use. The purpose of these audits is to determine if the worker is following the manual, as required by §§ 121.367(a) and 135.425(a). A negative finding is a program deficiency under the CASS rules and must be addressed. This follows the plain language meaning of the CASS regulation (i.e., the performance of the maintenance program: are they doing what the program requires?).
b)    The transaction method is used primarily for reviewing records and serves to see if the maintenance program standard for the record form, record procedures, record completion, and record accuracy standards of the program are being achieved.
c)    The systems method is a high-level, comprehensive, and documented examination of all of the activities, records, processes, and other elements of an air carrier or maintenance provider’s various systems, to determine their conformity with the requirements of a standard such as the air carrier’s maintenance program. These audits can also identify various latent faults in the air carrier’s or maintenance provider’s overall systems. These audits are usually, but not always, accomplished by a professional audit agency and result in a written and detailed report that is used by senior management to make corrections to the systems.
4)    The operator’s auditing process should have written procedures that include the scheduling of audits. The CASS must address both internal and external audits.

NOTE:  If your operator assigns an onsite representative at a maintenance provider, you should ensure that the air carrier’s CASS procedures cover the surveillance activities of that individual or those individuals. These are classed as work-in-progress audits. Typically, these individuals will observe a noncompliance with the air carrier’s manual and ensure that the maintenance provider corrects the noncompliance, usually within that same day. These events must be recorded and entered into the air carrier’s CASS documentation system for analysis and evaluation. Otherwise, a pattern of noncompliance with the air carrier’s program and manual may go undiscovered by air carrier and maintenance provider management personnel.

5)    CASS procedures should include a risk-based methodology for determining priorities and for establishing and adjusting audit cycles (for example, 12-, 18-, 24-, or 36-month cycles) so that resources are focused on the most pressing issues. You should note that the RMP may show that a department or maintenance provider self‑audit is applicable and effective.
6)    Although the majority of the inputs to this process would be generated internally, one additional input may be the results of outside audits of the operator or its vendors conducted by entities other than the operator. For example, the results of audits or inspections conducted by the FAA or the DOD may be useful by providing an operator with:

    Specific findings requiring an RCA and possible corrective action; and

    Information useful in focusing the operator’s own audits and operational data collection.

B.    Scheduling Audits. The operator may approach this initial scheduling task in many different ways, ranging from resource allocation based on company experience and very basic analysis to the use of a sophisticated, software‑supported risk analysis process. Within this range of possible methodologies, expect the operator’s CASS audit scheduling procedures to contain processes for making those decisions systematically, and expect those processes to be compatible with the size and complexity of the operator’s operations.
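As one illustration of a basic, systematic scheduling process, the Python sketch below assigns each audit area one of the example cycle lengths mentioned in this section (12, 18, 24, or 36 months) according to a risk index, then lists the areas by due date. The index bands, the function names, and the choice to schedule to the first day of the due month are all assumptions made for the example, not FAA requirements.

```python
from datetime import date

# (minimum risk index, audit cycle in months); the bands are hypothetical.
CYCLE_MONTHS = ((15, 12), (10, 18), (6, 24), (0, 36))


def audit_cycle_months(risk_index: int) -> int:
    """Return the audit cycle for an area: higher risk gets a shorter cycle."""
    for floor, months in CYCLE_MONTHS:
        if risk_index >= floor:
            return months
    raise ValueError("risk index must be non-negative")


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that falls `months` after `start`."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def schedule(areas: dict[str, tuple[int, date]]) -> list[tuple[str, date]]:
    """Map area -> (risk index, last audit date) to due dates, soonest first."""
    due = [(name, add_months(last, audit_cycle_months(index)))
           for name, (index, last) in areas.items()]
    return sorted(due, key=lambda item: item[1])
```

Adjusting a cycle then amounts to updating an area's risk index as new audit findings or operational data arrive and regenerating the schedule.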

C.    Safety and Operational Objectives. Encourage your operator to make this process as structured as possible. The operator should place priority first on safety and regulatory compliance and second on issues of operational efficiency. However, an effective CASS will meet all of these objectives.

1)    To identify the areas to audit and to set priorities, the CASS process should include consideration of factors in outside reports. These factors could include inspections, reports, special studies, or audits conducted by outside entities such as the FAA, DOD, Department of Transportation (DOT), Office of the Inspector General (OIG), or National Transportation Safety Board (NTSB). Outside reports may address:

    Information specific to the operator or its vendors;

    Information related to the industry as a whole and of interest to the operator; or

    Information about an accident, incident, procedure/process, or equipment type that is relevant.

2)    The operator should equip CASS auditors with checklists to ensure consistency and completeness of audits. The accountable manager for the CASS should ensure that the checklists are updated as needed. The checklists should be written in a manner that evaluates compliance with the 10 elements of the maintenance program. An auditor should also be permitted a level of flexibility to ask questions not contained in the checklist if he or she finds an area that requires further investigation.
3)    An operator’s procedures should include identification of all areas that need to be audited, along with a process for updating this list. The following list presents examples of areas operators should consider for routine audit. A CASS audit should verify that:
a)    Manuals, publications, and forms (paper and electronic versions) are useable, up to date, accurate, and accessible to users when they are performing assigned duties.
b)    Maintenance and alterations are performed according to the methods, standards, and techniques specified in the operator’s manuals, including ensuring that major repairs and alterations are properly classified and accomplished consistent with technical data approved by the Administrator.
c)    The maintenance provider properly stores, dispenses, identifies, and handles parts and components.
d)    ADs are appropriately evaluated, accomplished, and tracked.
e)    Aircraft modifications that have been installed as a result of AD requirements have not been removed or modified by subsequent repairs, alterations, or other modifications.
f)    Maintenance providers generate maintenance records in accordance with manual procedures, and those records are complete and correct.
g)    RIIs are identified and addressed according to the operator’s procedures.
h)    Authorized individuals execute §§ 121.709 and 135.443 Airworthiness Release Forms and log entries according to the operator’s procedures.
i)    Maintenance providers handle shift turnover records, work interruptions, and deferred maintenance according to applicable procedures.
j)    Maintenance facilities and equipment, including base and line stations and contract maintenance providers’ facilities, are adequate for the work that is to be done.
k)    Personnel, including those of contract maintenance providers, are qualified and competent to accomplish their duties.
l)    Tools and equipment are properly calibrated.
m)    Requirements for specialized tools or training are met, such as for nondestructive testing, Category II/III maintenance, and run-up/taxi.
n)    Computer programs for the maintenance program are used in accordance with specifications.
o)    Maintenance providers, vendors, and suppliers provide services and products according to the operator’s policies and procedures.
p)    Each aircraft released to service is Airworthy.
4)    CASS audits should be primarily proactive, searching out potential problem areas before they can result in undesirable events. However, CASS procedures should also address how to direct unscheduled audits in response to events or a series of events. For example, rejected takeoffs, unscheduled landings, in-flight shutdowns (IFSD), accidents, or incidents may indicate the need for special audits or surveillance under a CASS.
5)    One of the primary purposes of a CASS is to detect and analyze trends for indications of program weaknesses or deficiencies. For example, CASS auditors would not necessarily audit a single maintenance-related rejected takeoff, although the CASS would investigate the event as part of the reactive function. A CASS would, however, consider from a proactive, trend-monitoring point of view whether that instance indicated a need to focus audits on a particular area.
6)    Auditors and analysts should maintain informal lines of communication with personnel in the other departments so that maintenance personnel can discuss concerns they may have. Through this informal communications process, the operator can learn about potential hazards in the system. For example, the operator may learn about an event that could have occurred but, because of some intervention, did not. This event would be known to shop personnel but is otherwise difficult or impossible to detect in routine audits. With informal lines of communication open to shop personnel, a CASS may detect this near-event. You should ensure that the operator’s CASS procedures address how to encourage this type of communication and interaction.

D.    Analysis of Audits.

1)    Audit results should undergo an analysis to identify a deficiency or a real/potential hazard in any aspect or element of the maintenance program. The objective of the audit analysis is to allow the operator to address the problem in such a way as to avoid recurrence of the deficiencies. To the extent possible, the operator should set forth the analysis process in the CASS documentation.
2)    The analysis tells operators where to allocate resources and helps them understand what was identified. RM principles should be incorporated into the analysis process. The analysis will help CASS personnel determine the level of priority that the issue merits and what type of additional technical expertise may be required to complete the RCA and evaluate corrective action options.
3)    The analysis process should be as objective as possible to avoid any tendency to promote individual or commercial interests. The more thorough the analysis, the greater the likelihood the operator will uncover why the system deficiency occurred and how the organization can respond definitively.
4)    The analysis process starts during the audit itself, because auditors must collect information for later analysis. If a CASS is to uncover a procedural weakness, for example, information about the procedure must be collected. This should be factual and objective information that does not contain premature judgment about a root cause.
a)    Auditors and analysts should be encouraged to be inquisitive and think in terms of “what if?” so that the CASS functions proactively, detecting problem areas or trends before they can lead to an accident, incident, or infraction of regulations.
b)    For example, what if event X occurred in conjunction with observed condition Y? While audits are designed mainly to verify that an operator is performing maintenance in accordance with its manual, the regulations, and applicable requirements, auditors and analysts should also be alert for systemic deficiencies.
c)    There may be procedures in the manual that are correctly followed but that have become outdated, conflict with other manual procedures, or for some other reason are in need of a change. This assessment of the system design should also place priority on finding the systemic or root cause of a program deficiency over seeking to assign personal blame at any level of the organization.
d)    This inquisitive approach should spread throughout the CASS organization, from determining audit priorities and scheduling through auditing and analyzing, including monitoring and evaluating corrective actions. The end result of this system assessment is the identification of new or potential hazards.
5)    The audit analysis process is typically more qualitative than the operational data analysis. Operators may find it useful to manage the collected data through database or quantitative applications. Be aware that this approach does not have to be complicated or costly. The level of formality and sophistication should match the operator’s conditions.
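To show how modest such a quantitative aid can be, the Python sketch below tallies audit findings by program area and lists the areas with repeat findings, which an analyst might then prioritize for RCA. The area labels and the two-finding threshold are hypothetical, and the tally supports rather than replaces the qualitative analysis described above.

```python
from collections import Counter


def recurring_areas(findings: list[tuple[str, str]],
                    min_count: int = 2) -> list[tuple[str, int]]:
    """Count findings per area and keep areas with at least `min_count`.

    Each finding is an (area, description) pair. Results are ordered from
    the most findings to the fewest.
    """
    counts = Counter(area for area, _description in findings)
    return [(area, n) for area, n in counts.most_common() if n >= min_count]


# Hypothetical findings from a series of audits.
sample_findings = [
    ("records", "sign-off missing on task card"),
    ("tooling", "calibration sticker expired"),
    ("records", "deferred item not carried forward"),
    ("tooling", "tool not in inventory log"),
    ("records", "illegible entry"),
    ("training", "recurrent training overdue"),
]
```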

3-3900    VERIFY THE CASS FUNCTIONS CONCERNING THE EFFECTIVENESS OF THE MAINTENANCE PROGRAM.

A.    Surveillance of the Effectiveness of the Maintenance Program.

1)    The main tool for assessing whether the air carrier’s maintenance program is effective is the collection of operational data (data resulting from airplane operations). This way, the output of the maintenance program can be measured. However, not all operational data or information may be useful for determining maintenance program effectiveness.

NOTE:  Consistent with the “effectiveness” part of the CASS regulation, the primary type of effectiveness surveillance that the operator should be accomplishing is the collection of operational data.

2)    A primary goal of air carrier maintenance programs is to ensure that each air carrier aircraft released to service is Airworthy, as well as to provide the maximum level of availability for operations in air transportation. However, in order to consistently reach these goals, the air carrier must have a means of determining if the maintenance program is producing the intended results.
3)    Generally speaking, at the end product level, an indicator of the effectiveness of the maintenance program is the amount of time an air carrier aircraft is not available for operations in air transportation due to issues controlled by the maintenance program. This particular effectiveness indicator can be broken down into fleet availability or individual aircraft availability, and broken down still further to the reliability of aircraft systems, subsystems, and components. In simple terms, the amount of unscheduled maintenance that reduces the availability of an air carrier aircraft for operations in air transportation is a primary indicator of whether or not the maintenance program is producing its intended results.
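The availability indicator described above reduces to simple arithmetic. The Python sketch below is one possible formulation, not an FAA-prescribed metric: availability is taken as the fraction of hours in a reporting period during which an aircraft was not out of service for unscheduled maintenance. The function names, the tail numbers in the usage note, and the choice of hours as the unit are assumptions.

```python
def availability(hours_in_period: float, unscheduled_down_hours: float) -> float:
    """Fraction of the period not lost to unscheduled maintenance."""
    if hours_in_period <= 0:
        raise ValueError("hours_in_period must be positive")
    if not 0 <= unscheduled_down_hours <= hours_in_period:
        raise ValueError("down hours must lie between 0 and hours_in_period")
    return 1.0 - unscheduled_down_hours / hours_in_period


def fleet_availability(down_hours_by_tail: dict[str, float],
                       hours_in_period: float) -> float:
    """Aggregate the same indicator across every aircraft in the fleet."""
    total_possible = hours_in_period * len(down_hours_by_tail)
    return availability(total_possible, sum(down_hours_by_tail.values()))
```

For example, one aircraft that loses 36 hours to unscheduled maintenance in a 720-hour month has an availability of about 0.95. If a second aircraft in a two-aircraft fleet loses no time, the fleet figure is about 0.975. The same calculation could be repeated at the system or component level.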

B.    Collecting Operational Data.

1)    Air carrier operational data collection systems under the CASS effectiveness activity are critical to the air carrier’s ability to determine the level of effectiveness of its maintenance program. These systems should have capabilities for collecting, storing, managing, and retrieving all types of operational data that the air carrier can use to help it determine the level of maintenance program effectiveness.
2)    Current systems that collect information regarding the status of aircraft structures, systems, and engines vary widely, from simple paper systems administered manually by air carrier personnel to sophisticated, automatic, real-time data collection systems that use information collected from sensors embedded throughout the aircraft. As of this writing, there are operational data collection systems planned that will manage, and sometimes repair, system faults through automatic computer activity. Newer transport category aircraft are delivered with sophisticated electronic, propulsion, flight control, and structural monitoring and data acquisition systems.
3)    In recent years, an increased emphasis has been placed on using these automatic data collection capabilities, in conjunction with emerging sensor, data processing, and systems status monitoring and assessment technologies, to determine the real-time condition of aircraft components. While most of these automatic systems are not well defined, the goal is to use real-time flight data to detect system flaws, defects, or abnormal operating conditions early enough to allow timely intervention.
4)    The key thing to remember is that these new maintenance management systems are part of the continuing evolution of maintenance. They should be characterized as a new and different way of doing maintenance, not a means of eliminating maintenance. Maintenance is still being accomplished, so these systems do not eliminate maintenance actions, although they may eliminate some scheduled maintenance activities.
5)    Contact the Aircraft Maintenance Division (AFS-300) for specific guidance before you authorize an air carrier to use an automatic data collection system or automatic Aircraft Systems Health Monitoring and Management System.

C.    Operational Data Procedures.

1)    The operator should have written procedures to guide its operational data collection process. CASS procedures should include a risk-based methodology for determining the type and frequency of operational data collection so that resources focus on the most revealing data, with regard to maintenance program effectiveness. An air carrier CASS should include clear procedures for determining:

    What operational data to collect;

    Who will collect it;

    How to collect it;

    When to collect it; and

    What to do with it.

2)    Operational data can be divided into routine or nonroutine data collection and analysis. The routine data element uses a proactive data collection and analysis process that seeks to identify indicators of maintenance program ineffectiveness before they can progress to a functional failure that results in a reduction in aircraft availability. Some examples are:

    Aircraft logbook information detailing unscheduled maintenance, including maintenance deferred in accordance with the MEL/Configuration Deviation List (CDL);

    “Chronic” aircraft systems that have repeat write-ups within a specified time period (e.g., 10 to 15 days);

    Corrosion prevention and control program findings;

    Engine condition trend monitoring data;

    Individual item failure rates; and

    Mechanical Reliability Reports (MRR), Mechanical Interruption Summaries (MIS), and similar data.

3)    The nonroutine operational data element is a reactive data collection and analysis process that seeks to identify indicators of maintenance program ineffectiveness after an undesirable event has occurred. Some examples are:

    Accidents and incidents;

    In-flight engine and propeller separations and uncontained engine failures;

    In-flight engine shutdowns;

    High-load events;

    Flight delays and cancellations related to mechanical issues;

    Rejected takeoffs;

    Unscheduled parts replacement or unscheduled maintenance;

    Unscheduled landings due to mechanical issues;

    Lightning strikes; and

    Hard landings.

4)    As with reactive audit surveillance, a CASS generally approaches problems from the analytical, systems perspective. For example, in response to one or more rejected takeoffs, a CASS might focus the operational data collection and analysis to determine if a pattern in rejected takeoffs was evident, or a CASS might examine other types of data in relation to the rejected takeoff situation.
5)    The above data sets are presented only as examples. Although the data sets are oriented toward equipment, this area of a CASS may also collect other types of data, such as information on the different types of maintenance errors experienced by the operator.
6)    The operator’s CASS documentation should include a means of identifying data that is relevant and useful for that operator to use in monitoring the effectiveness of its specific maintenance program. The operator should periodically review and reevaluate the usefulness of the data it collects and analyzes to accomplish this portion of the CASS.

D.    Analysis of Operational Data.

1)    Provide analysts with an understanding of the potential significance of each data set and how to process the data to understand its significance. This may require statistical analysis to compare the frequency of certain events or equipment failures with a determined norm, or qualitative analysis to evaluate reports of certain types of events.

NOTE:  This process is not necessarily the same as what would be used in an FAA-approved reliability program.

2)    Emphasize that the analysis of operational data should consider root causes of negative trends or anomalies. This preliminary RCA, including human factors, may require collaboration with technical personnel in the affected areas, specialists in engineering and reliability departments, or the OEM.
3)    Delineate the roles of the CASS analysts as well as other departments or personnel in the analysis of operational data.
4)    Some operators select a system that uses alerts or warnings if results of the analysis exceed certain predetermined parameters. A CASS should not rely completely on such alerts to the exclusion of analysts’ judgment. The FAA’s expectation of a CASS in this regard is that the operator has a complete, written procedure to review and analyze the operational data collected and to determine when further review is necessary.
5)    While the surveillance and analysis steps differ for the verification of the performance of the maintenance program versus verification of the effectiveness of the program, the process merges when responding to CASS findings and providing a corrective action, as necessary.
6)    Results from the two requirements of CASS (performance and effectiveness) identify potential deficiencies (hazards) in the maintenance program. In responding to these findings and analyses, the objective of a CASS is to determine the root causes of program deficiencies and address them appropriately, regardless of the perspective from which the deficiencies are found. Note that the discussion is focused on a CASS function, not an organization. For a given operator, that function might be performed by more than one organization.
7)    Generally, the area responsible for surveillance will present its results to the technical or production area of the operator with a preliminary analysis of the collected information and, in some cases, possible underlying causes of the problem. Personnel in technical or production areas usually complete the RCA and develop proposed corrective action alternatives.
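
The comparison of event frequency with a determined norm, and the use of alerts when results exceed predetermined parameters, can be sketched as follows. This is a minimal example under stated assumptions and not an FAA-approved reliability method. The names `rate_per_1000_hours` and `needs_review`, the norm defined as the historical mean plus `k` sample standard deviations, and the sample figures are all hypothetical.

```python
from statistics import mean, stdev


def rate_per_1000_hours(events, flight_hours):
    """Normalize an event count to a rate per 1,000 flight hours."""
    return 1000.0 * events / flight_hours


def needs_review(history_rates, current_rate, k=2.0):
    """Compare the current rate with an alert limit derived from history.

    The limit is the historical mean plus k sample standard deviations
    (an assumed norm; an operator would define its own parameters).
    Returns (flagged, limit).
    """
    limit = mean(history_rates) + k * stdev(history_rates)
    return current_rate > limit, limit


# Hypothetical monthly rates of unscheduled removals per 1,000 flight hours.
history = [1.0, 1.2, 0.8, 1.0, 1.0]
current = rate_per_1000_hours(events=9, flight_hours=5000)

flagged, limit = needs_review(history, current)
print(f"current={current:.2f} limit={limit:.2f} flagged={flagged}")
```

A flag of this kind only indicates that further review may be necessary. Consistent with paragraph 4) above, it should supplement the analysts' judgment, not replace it.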

3-3901    VERIFY THE CASS FUNCTIONS CONCERNING THE CORRECTIVE ACTION PROCESS. A corrective action process is the process of interdependent activities that traces the symptom(s) of a problem to its cause, produces solutions in a timely manner for preventing the recurrence of the problem, and implements the changes. The activities within the corrective action process are:

    RCA;

    Development of corrective action;

    Implementation of corrective action; and

    Monitoring a corrective action.

A.    RCA.

1)    RCA applies to both audit findings and analysis of results and trends in the operational data. For example, either audits or operational data analysis may point to maintenance errors caused by inadequate training. Analysis should not stop with simply determining which mechanics received inadequate training and then retraining the mechanics. Rather, the analysis should determine why the training breach occurred and consider areas in management, communications, scheduling, or training program design that may be involved.
2)    The key is to have a CASS structure that addresses the basic disciplines and elements involved in finding and correcting program deficiencies. The CASS procedures should note that in performing an RCA, all relevant areas should be considered, including the role of senior management in setting appropriate policies, procedures, and an environment of communication.
3)    Following a systems approach, an RCA treats errors as defects in the system rather than in an individual. RCA looks beyond the symptom to find the organizational defect that permitted an error to occur, to correct the fundamental problem, and to prevent recurrence. The more thorough the analysis, the greater the likelihood that the operator will uncover why the system deficiency occurred and how the organization can respond definitively.
4)    The process starts during the audit itself, because auditors must collect information conducive to later analysis. If a CASS is to uncover a procedural weakness, for example, information about the procedure must be collected. This should be factual and objective information, not a premature judgment about the root cause. RCA is a key to any complete CASS, even though procedures may vary in complexity from operator to operator.
5)    Regarding the thoroughness of the analysis, the principles and considerations of an RCA are closely related to those of risk assessment. Neither process considers only the person involved in an issue (e.g., the mechanic made a mistake); both consider all aspects of the organization within which that person works. This approach includes the premise that human error is a consequence of the system rather than a deliberate action of an individual. It also holds that proactive measures and continuous reform of different aspects of the processes and organization can address latent conditions in the system and increase the system’s resistance to operational hazards. The term “latent condition” refers to flawed procedures or organizational characteristics that are capable of creating hazards if the right conditions or actions occur.
6)    Ensure that the operator’s procedures or corporate culture do not advocate the blame culture. The blame culture can have a significant negative effect on safe operations. Terminating the individual to whom blame is assigned is usually not consistent with an effective RCA.
a)    Operators that adopt the blame culture:

    Fix the blame and move on;

    Focus on the individual(s) who made the error;

    Stop short of identifying systemic problems and root causes;

    Never fix the problem; and

    Allow mishaps/mistakes to recur.

b)    Written CASS procedures for an RCA should address the following questions:

    Do documented procedures lay out when the RCA process will be used?

    What events would trigger the RCA process?

    Do procedures describe who will perform an RCA?

    Are CASS auditors and other CASS personnel trained in RCA?

    Are personnel who are developing a corrective action trained in RCA?

    Do the personnel performing an RCA have a direct knowledge of the process deficiency?

    Are there procedures specific to how an RCA will apply equally to the performance and effectiveness of CASS?

    Does management “support” RCA?

    How does an RCA address findings from outside sources?

B.    Areas of Emphasis for an RCA.

1)    Systems analysis plays an increasingly important role in a CASS because of the increasing complexity and variety of operations, equipment, and organizations. Systems analysis emphasizes a coordinated approach to an enterprise in terms of integrated networks of people and other resources performing activities that accomplish some mission or goal in a prescribed environment.
2)    Focusing on a systems approach for identifying why a deficiency occurred, personnel working on the proposed corrective action(s) should ensure that they evaluate the characteristics within the design of a process. This approach recognizes the wide range of interrelated issues that may be associated with a problem in the system, such as management policies, communications, and pilot technique, in addition to the maintenance activities themselves.
3)    Human factors analysis looks at how humans communicate and perform in the work environment and then seeks to incorporate that knowledge into the design of equipment, processes, and organizations. This enhances safety and maximizes the human contribution, partly by designing systems to anticipate the inevitability of human error. Human factors that audit checklists can address include basic issues, such as whether there is adequate lighting for mechanics and inspectors to perform their work and whether schedules permit personnel to get proper rest. The discipline addresses a wider range of issues affecting how people interface with technology and the operational system, including:

    Human physiology;

    How people learn and perceive;

    Equipment, technology, and documentation; and

    Workplace.

4)    Knowledge gained from human factors analysis can:

    Help avoid maintenance errors;

    Ensure that personnel skill sets match task requirements;

    Ensure that skill sets are maintained and improved; and

    Enhance the work environment.

a)    This knowledge can help CASS analysts perform an RCA.
b)    Continuing with the previous example of inadequate training, with insufficient awareness of human factors issues, operators may trace a maintenance error to a mechanic or technician who appears to have received insufficient training for the task and determine that the solution is more technical training. However, further analysis may reveal that there are contributing flaws in equipment design, job cards, manuals, the work environment, or organizational procedures, such as shift turnover, that more training will not satisfactorily overcome. Or, it may turn out that a different kind of training, perhaps involving decisionmaking skills, is called for.
5)    As of this writing, the FAA is deeply involved in cooperative efforts with industry and academia in promoting human factors in aviation. This field is rapidly evolving, particularly in its application to aviation maintenance. According to a study conducted for the FAA, which cited Boeing research, maintenance error contributes to a significant portion of air carrier accidents, with shift turnover errors and work interruptions standing out as leading underlying causes. Based on the growing importance of human factors and information available to industry, the FAA expects that operators will apply concepts of human factors to their CASS surveillance and analysis.
6)    CASS surveillance also should ensure that an RCA considers human factors. Those personnel designated to respond to events such as rejected takeoffs should also include human factors as part of their investigation of individual events. Otherwise, data reviewed in a CASS may be incomplete.
7)    A strong CASS will focus on safety issues and support a just culture. Although the function of the CASS is to identify and correct deficiencies in the maintenance program, an effective RCA may find inadvertent human errors that can be corrected with compliance tools, or unacceptable behavior that may require company disciplinary or FAA enforcement action. Safety-conscious and compliant individuals will naturally report, cooperate, and address mistakes, design flaws, and/or systemic issues within a culture that is just. When a fair and impartial RCA uncovers facts that identify unsafe acts that are intentional or reckless, or reflect a pattern of unacceptable risk taking, safety-conscious individuals expect just mitigating action and accountability to take place.

C.    Analytical Tools and Processes. While it is not a requirement for an operator to implement any specific externally developed system, analytical tools or processes are available to assist in the analysis process. In view of the continuing evolution of this process, as of this writing, some examples of these tools are listed below.

1)    The Maintenance Error Decision Aid (MEDA) Tool. The Boeing Human Factors Engineering group in collaboration with the FAA, airlines, and the International Association of Machinists developed this tool for analyzing human performance issues related to maintenance errors and trends. Operators use MEDA to track events, investigate and prevent maintenance errors, and identify contributing factors, corrective actions, and prevention strategies. A software analysis package has been developed to work with this aid and facilitate analysis of systemic issues.
2)    The Managing Engineering Safety Health Tool. The University of Manchester developed this tool in collaboration with British Airways Engineering. The focus of this system is researching the workplace and organizational environment in aircraft maintenance to find the issues with the greatest potential to contribute to human factors problems. The system uses software, diagnostic, and sampling tools. Managing Engineering Safety Health conducts anonymous survey-like assessments among personnel at the work location. This is a more structured, data‑intensive approach toward determining and monitoring personnel attitudes toward the system than the interview process discussed earlier. The industry has far less practical experience with Managing Engineering Safety Health than with MEDA.
3)    The Human Factors Accident Classification System Maintenance Extension Tool. The U.S. Naval Safety Center, in collaboration with the FAA, developed this tool for use in the air carrier industry as well as naval aviation. This comprehensive system incorporates a number of analytical tools and has profiled maintenance errors and contributing conditions, permitting development of potential prevention measures. While the Human Factors Accident Classification System Maintenance Extension may be more sophisticated than many operators would need, it demonstrates principles and techniques of software-aided analysis that operators could apply to a CASS.

D.    Developing and Implementing a CAP.

NOTE:  For purposes of the CASS, CAP stands for Corrective Action Plan. Do not confuse it with the acronym CAP for Comprehensive Assessment Plan used in SAS.

1)    Development. After the assessment of risk and completion of an RCA, a final decision can be made on a proposed CAP. As directed by written procedures, the CAP should address the root cause of the deficiency and provide a means of verifying that the corrective action fixed the problem.
a)    Responsibility/Authority.

1.    A CASS should designate the position or organization responsible for evaluating and approving proposed corrective actions as well as the parties responsible for implementing, monitoring, and ensuring that all affected parties receive notification, both within the company and externally, if necessary.

2.    The CASS director or other designated manager may appoint a corrective action team to design and propose a corrective action. The team, which typically represents a cross section of the departments involved in audits, operational data collection, analysis, and production, oversees the implementation of the corrective action.

3.    Technical and reliability control boards are most often used in conjunction with FAA-approved reliability programs; however, a similar concept applies to a CASS, even if no FAA-approved reliability program exists.

4.    The danger exists that one individual might be assigned to develop an entire CAP that they have little or no control (authority) to implement. This is a discrepancy in itself and must be fully addressed before any corrective action is assigned. Ultimately, the direct responsibility must always remain with the department required to address the discrepancy.

5.    For example, a corrective action may require a revision to a manual. The department (inspection) that is responsible for the CAP might require the assistance of another department (technical publications) to publish the revision. Consequently, the department that is able to publish the manual would now hold secondary responsibility for publishing the revision.

6.    It would be acceptable for auditors to help guide the responsible persons through the corrective action process. However, the auditors must remain independent from the corrective actions they may subsequently audit. The roles of auditors, analysts, managers, and committees should be clear when implementing the CAP.

7.    The appropriate authority must accompany the responsibility if the process is to be effective. CASS procedures should address:

    Who has received the authority to develop the CAP?

    Who will be responsible to develop the CAP?

    Who is responsible to approve the CAP?

    Does the air carrier maintain the appropriate role of auditors in developing a CAP?

b)    Duties.

1.    The CASS procedures should identify how this plan will receive approval and at what level of the company. The CAP should address all relevant issues, including a timetable for completion of the action with milestones, if appropriate. The appropriate technical department (and other departments, such as flight operations, if the corrective action goes beyond the inspection and maintenance organizations) should then implement the plan.

2.    While developing a corrective action, the operator should also consider how its effectiveness will be determined. The corrective action should state exactly what will be evaluated during the followup process. Because the corrective action is developed to address the causal factor(s), the determination of effectiveness should be based on specific measure(s) that show whether those causal factors have been addressed.

3.    For example, if the deficiency involved the installation of incorrect fasteners and the root cause was determined to be the lack of training, the future followup would evaluate the effectiveness of any training that might have been implemented as it relates to the deficiency. Are incorrect fasteners still being installed after training has been accomplished?

c)    CASS procedures should specify that personnel will analyze a corrective action proposal carefully before its selection and implementation to ensure that corrective action is necessary and will actually fix the problem and not lead to unintended negative consequences.

1.    Air carrier procedures should instruct both the CASS and technical area personnel of the need to consider the impact of the proposed corrective action on other aspects of the operation. This includes other areas of the maintenance program, such as manuals. The corrective action may require coordination with other areas, such as flight operations, that might be affected.

2.    A CASS should provide written procedures for the development of a CAP that addresses:

    When a CAP will be developed, based on risk assessment;

    How to develop a corrective action proposal with the focus being on addressing the root cause, as necessary;

    Documentation of timelines for accomplishment of tasks within the CAP;

    How the plan will be approved;

    The recordkeeping and documentation requirements of the CAP;

    How risk assessment and/or systems analysis will be used to guard against unintended consequences; and

    How the effectiveness will be determined during the followup process (as necessary).

3.    In some cases, the operator may require data or assistance from a manufacturer to help correct a deficiency detected by the CASS. The operator should offer guidance in its CASS procedures, based on its particular experience, on how CASS and other personnel should address the need for assistance or information from manufacturers, and how to proceed in case of unsatisfactory or slow responses. This may include developing a standardized letter citing the need for this information or assistance to satisfy the requirements of § 121.373, § 135.431, or other pertinent regulations. It may also include working with the FAA principal inspector (PI) to find solutions.

2)    Implement the CAP. After development of a CAP, the responsible individual(s) must implement the plan. The importance of actually implementing the plan cannot be overstated. The completion of the corrective action must occur so that the discrepancy is addressed and the necessary followup surveillance can occur so that the effectiveness of the corrective action can be determined.
a)    Responsibility/Authority.

1.    The danger exists that one individual might be assigned to implement an entire CAP that they have little or no control (authority) to implement. This is a discrepancy in itself and must be fully addressed before any corrective action is assigned. Ultimately, the direct responsibility must always remain with the department that is required to address the discrepancy.

2.    For example, a corrective action may require a revision to a manual. The department (inspection) that is responsible for the CAP might require assistance from another department (technical publications) to publish the revision. Consequently, the department that is able to publish the manual would now hold secondary responsibility for publishing the revision.

3.    In the event that a CAP requires changes, procedures should assign authority to specific individuals to enable them to make changes to the plan.

4.    It would be acceptable for auditors to help guide the responsible persons through the corrective action process. However, the auditors must remain independent from the corrective actions that they may subsequently audit.

5.    The roles of auditors, analysts, managers, and committees should be clearly defined when implementing the CAP.

6.    The appropriate authority must accompany the responsibility if the process is to be effective. CASS procedures should address the following questions:

    Who will be responsible for implementing the CAP?

    Who is responsible for making changes to the CAP?

    Who will determine when the CAP has been completed?

    Does the air carrier maintain the appropriate role of auditor when implementing a CAP?

b)    Duties. Accomplishment of these duties is essential to the success of a CAP.

1.    CASS procedures should ensure that the corrective action is implemented. Procedures should specify how to implement the plan from the time of development to closure.

2.    The CAP should be specific as to what is expected to occur or to be accomplished. Clear timelines should document the completion of specific actions within the plan. For example, a CAP might have numerous actions that must occur in order to complete the plan, perhaps in a sequential order.

3.    It is essential that communication exist throughout the corrective action process. Procedures should include guidelines for how the technical area will communicate the status of the corrective action to the person responsible for monitoring implementation. Procedures should also inform all parties involved of what will constitute closure of individual action items within the plan and/or the plan as a whole.

4.    The procedures for auditors, analysts, managers, and committees should be clearly established if the process is to be effective. CASS procedures should address the following questions:

    How will the implementation of a CAP be accomplished?

    How will changes to the documented plan and/or timeline be addressed?

    When will individual action items within the CAP and/or entire CAP be considered “closed”?
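
The questions above on responsibility, approval, and closure suggest the kind of information a CAP record might carry. The following is a minimal sketch, assuming a simple in-memory record. The class names `ActionItem` and `CorrectiveActionPlan`, their fields, the sample data, and the closure rule (a plan is closed only when it has action items and all of them are complete) are hypothetical and would be defined by each operator's own procedures.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    description: str
    responsible_dept: str            # department with authority to act
    due: date                        # milestone from the CAP timetable
    completed_on: Optional[date] = None

    @property
    def closed(self) -> bool:
        return self.completed_on is not None


@dataclass
class CorrectiveActionPlan:
    finding: str
    root_cause: str
    approver: str                    # position that approves the plan
    effectiveness_measure: str       # what followup will evaluate
    items: List[ActionItem] = field(default_factory=list)

    def open_items(self) -> List[ActionItem]:
        return [item for item in self.items if not item.closed]

    def is_closed(self) -> bool:
        # Assumed rule: closed only when every action item is complete.
        return bool(self.items) and not self.open_items()


# Hypothetical plan based on the incorrect-fastener example in this section.
plan = CorrectiveActionPlan(
    finding="Incorrect fasteners installed",
    root_cause="Lack of training",
    approver="Director of Quality",
    effectiveness_measure="Incorrect fastener findings in followup audits",
    items=[
        ActionItem("Revise job card", "Technical Publications",
                   date(2015, 6, 1), completed_on=date(2015, 5, 28)),
        ActionItem("Train line mechanics", "Maintenance Training",
                   date(2015, 7, 15)),
    ],
)

print(plan.is_closed(), [item.description for item in plan.open_items()])
```

Recording the responsible department on each action item keeps authority and responsibility together, as the paragraphs above require.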

3)    Monitoring the CAP.
a)    Under the CASS, monitoring of the CAP should be a documented and systematic approach towards ensuring the implementation of the documented CAP. Without documented procedures for monitoring the CAP, the possibility exists that corrective action will not be implemented.
b)    Furthermore, if a corrective action was developed to mitigate or eliminate causal factors and was not implemented, the effectiveness of the corrective action could not be measured and the causal factors would still exist. CASS procedures should ensure that the corrective action was completed.
c)    Responsibility/Authority.

1.    An identifiable individual or entity (such as a CASS board) should be given the overall responsibility and authority for monitoring the status of the CAP. CASS auditors or analysts may have the direct responsibility of ensuring that the corrective action has been completely implemented in accordance with the established timetable or, if not, determining why the timetable has changed.

2.    Responsibilities should include determining if any changes in the corrective action are acceptable, as well as who will make the determination for plan closure.

3.    The roles of auditors, analysts, managers, and committees should be clear when monitoring the CAP. The appropriate authority must accompany the responsibility if the process is to be effective.

4.    CASS procedures should address:

    Who will monitor the status of the CAP?

    Who will approve changes to the CAP?

    Who will determine when the CAP has been completed?

d)    Duties. The following procedures are essential to the success of the corrective action:

1.    CASS procedures should ensure that the corrective action was completed. Therefore, procedures should specify how the plan will be monitored from the time of implementation to closure.

2.    The means for tracking the corrective action against a timeline will vary between operators, and methods are normally dependent on the policies of the operator. Monitoring the plan may be accomplished through the use of electronic media and/or paper media. Procedures should identify what method or methods will be used to monitor the implementation of the CAP.

3.    Effective communication must exist between the owner of a CAP and the individual who is monitoring the plan. Procedures should include clear guidelines for communicating the status of the corrective action from the affected technical area to the individual responsible for monitoring implementation. Also, procedures should provide all parties involved with a clear performance standard for closure of individual action items within the plan and/or the plan as a whole.

4.    Auditors, analysts, managers, and committees must have clearly established procedures if the monitoring process is to be effective. CASS procedures should address the following questions:

    How will the CAP be tracked (monitored) in accordance with the timeline?

    How will automation or computerized systems be used to monitor the implementation?

    How will changes to the documented plan and/or timeline be addressed?

    When will individual action items within the CAP and/or entire CAP be considered “closed”?
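
Tracking a CAP against its timeline can be sketched as follows. This is an illustrative example only. The function name `cap_status`, the dictionary keys, and the sample action items are assumptions, and an operator may equally track this information on paper or in an existing system.

```python
from datetime import date


def cap_status(action_items, today):
    """Summarize a CAP's action items against the documented timeline."""
    open_items = [i for i in action_items if i["completed_on"] is None]
    overdue = [i["task"] for i in open_items if i["due"] < today]
    return {
        "open": len(open_items),
        "overdue": overdue,
        # Assumed rule: a plan with no action items is not considered closed.
        "closed": bool(action_items) and not open_items,
    }


# Hypothetical action items, for illustration only.
items = [
    {"task": "Revise fastener job card", "due": date(2015, 6, 1),
     "completed_on": date(2015, 5, 28)},
    {"task": "Publish manual revision", "due": date(2015, 6, 15),
     "completed_on": None},
    {"task": "Train line mechanics", "due": date(2015, 7, 15),
     "completed_on": None},
]

status = cap_status(items, today=date(2015, 7, 1))
print(status)
```

In this sample, two items remain open and the manual revision is overdue. Under the operator's procedures, the person monitoring the plan would then determine why the timetable has changed.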

E.    Verify the CASS Functions Concerning the Followup Process.

1)    At the beginning of the corrective action process, a risk-based determination was made to mitigate or eliminate the associated risk. This determination led to the development and implementation of a CAP (risk control).
2)    When a CAP included an RCA, the primary goal of the CAP would have been to prevent recurrence of the discrepancy. To be effective, the plan would have specifically addressed the identified causal factor(s).
3)    Additional surveillance or data collection may be necessary to validate the effectiveness of the CAP. The followup surveillance plan is the means by which the effectiveness is validated. It has two principal elements: verifying the effectiveness of the CAP and additional surveillance planning (auditing and/or data collection).
a)    Responsibility/Authority.

1.    An identifiable individual or entity (such as a quality organization) should be given the responsibility and authority for performing the followup of the CAP. CASS auditors or analysts may have the responsibility for making a determination of the effectiveness of the CAP and may sometimes determine that the corrective action was not effective and requires additional action(s).

2.    The roles of auditors, analysts, managers, and committees should be clearly defined in the CAP process as well as the additional surveillance planning process. The appropriate authority must always accompany the responsibility if the process is to be effective.

3.    CASS procedures should address the following questions:

    Who will validate that the CAP was effective?

    Who will approve changes to the surveillance planning or immediate actions?

b)    Duties.

1.    Performance measures to evaluate the effectiveness of the corrective action should be specific. The performance measures should have been established during the development of the CAP and should provide the information necessary to determine the level of action plan effectiveness.

2.    Verifying the CAP effectiveness may require a one-time audit or could require a series of frequent audits. CASS procedures should include how to determine the level of followup audits for verifying corrective action implementation. For example, based on the risk assessment or complexity of the corrective action, the designated CASS analyst or team may schedule special, less frequent, or more frequent audits.

3.    They may also change the data collection process or institute other means of verification. The FAA expects the operator to have a well-designed and logical process to design the followup actions.

4.    The accomplishment of the followup process should be verifiable. The operator should document the outcome of the process. This documentation should provide enough information to be able to conclude that the process has been accomplished.

5.    The procedures for auditors, analysts, managers, and committees should be clear if the process is to be effective. The methodology will probably vary from one operator to another, but the principles should be evident and verifiable in written procedures.

6.    CASS procedures should address:

    What measures will be used to evaluate the effectiveness of the corrective action, including identification of data to be collected, awareness of the possibility of unintended consequences, and events that should trigger a response;

    When and/or how often the followup will occur;

    How to use the automation or computerized systems to document the followup process; and

    How to determine changes to surveillance planning and/or data collection.
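
A specific performance measure for followup might be expressed as in the sketch below. It compares the rate of findings before and after the corrective action, using the incorrect-fastener example from earlier in this section. The function names, the sample counts, the target rate, and the rule that the CAP is effective only if the rate both improves and meets the target are assumptions for illustration. They are not FAA criteria.

```python
def finding_rate(findings, sampled):
    """Fraction of sampled installations that had a finding."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return findings / sampled


def evaluate_followup(before, after, target_rate):
    """Compare finding rates before and after a corrective action.

    before, after: (findings, installations_sampled) tuples.
    target_rate: performance measure set when the CAP was developed.
    """
    before_rate = finding_rate(*before)
    after_rate = finding_rate(*after)
    return {
        "before_rate": before_rate,
        "after_rate": after_rate,
        # Assumed rule: the rate must improve and meet the target.
        "effective": after_rate < before_rate and after_rate <= target_rate,
    }


# Hypothetical audit results: incorrect fasteners found per installation sampled.
result = evaluate_followup(before=(12, 200), after=(1, 250), target_rate=0.01)
print(result)
```

If the measure is not met, the designated analyst or team would determine whether additional corrective action or changes to surveillance planning are needed.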

F.    Analyze Results. Follow SAS guidance for Module 5.

3-3902    TASK OUTCOMES. Follow SAS guidance for Modules 4 and 5.

3-3903    FUTURE ACTIVITIES. Follow SAS guidance.

RESERVED. Paragraphs 3-3904 through 3-3915.