Bill Hewlett and Dave Packard graduated from Stanford University in 1935 with degrees in electrical engineering. In 1939, they formalized their partnership. Unable to settle on a name for the company, they decided it with a coin toss. In 1999, HP announced a strategic realignment to create two companies: an independent measurement company comprising its test and measurement, components, chemical analysis, and medical businesses, and a computing and imaging company comprising all of HP’s computing, printing, and imaging businesses (Hsu, Sarson, Schatzberger, & Leisenberger, 2016).

Agilent Technologies, the new name for the measurement company, was announced at a historic brand-identity launch event in San Jose, Calif., by Agilent President and Chief Executive Officer Ned Barnholt (KeysightTech, 2019). In 2013, Agilent Technologies announced that it would split into two pure-play measurement companies. The name of the new electronic measurement company, Keysight Technologies, was to be announced later that year.

During 2014 the separation process continued, and on November 1 Keysight Technologies became a fully separate electronic measurement company (U.S. Government Printing Office, 1989). On November 3, 2014, Keysight listed on the New York Stock Exchange under the ticker symbol KEYS, completing the final phase of its separation from Agilent.

1.2. Problem Background of Discover Test Failure pattern

Keysight offers a variety of hardware and software products. Its instruments include multimeters, signal analyzers, atomic force microscopes, power supplies, and handheld tools. Keysight serves the aerospace and defense, telecommunications, automotive and energy, and semiconductor industries. Keysight Technologies is the world’s leading electronic measurement company.

Most Keysight products undergo 100% testing using automated tests. A unit is deemed to have passed only if it passes every test. This makes the test process straightforward and easy to administer, and it should ensure zero defects. The drawback is that complete 100% testing takes time, and the cost of testing becomes evident in high-volume production with long test times. The test results of units tested in manufacturing contain a large amount of measurement, process, and specification data. After testing, the results are stored in a raw data format by the test executive software. Data analysis techniques have the potential to uncover new insights from the tests by looking at the data from a different angle than the traditional statistical process control (SPC) approach. A product can fail for a number of reasons that are not obvious, or that are difficult to surface without tedious statistical analysis of the test results.

1.3. Problem Statement of Discover Test Failure pattern

Whenever there is a yield problem on the production line, nothing is more frustrating for test engineers than knowing that the key to unlocking the problem is available but lost somewhere in a mountain of data files or databases. Data collection usually compiles test results from different time periods, different sites, or different test conditions. With shorter test development cycles, test engineers mostly use the data to build tests and examine it only through basic filtering and simple analysis. As a result, efforts to shorten the test time often yield only incremental speed improvements. Engineers also spend more time troubleshooting product quality issues because of data overload. To ramp up volume production of a newly launched product, it is often easiest to allocate more engineers or set up more stations to test in parallel.

In manufacturing, the production operation faces the challenge of optimizing an inefficient process. The test process studied here is named “AC20GHz_Sequences.TrigMisc”. About 30% of units require at least 3 rounds of the run, and each run takes 2 hours to complete. This is because the test stops automatically whenever a failure point is found. The device under test (DUT) is then sent to the rework station, where it queues for rework. Once rework is done, the DUT is sent back to the test station and the test resumes from the last failure point. The rework process is repeated each time the next failure point is met.

This repeated activity between the test station and the rework station is a form of production waste: it creates a lot of DUT movement to and from the rework station, as well as DUT queuing time at the rework station. Under the LEAN definition, these are categorized as the waste of motion and the waste of waiting.
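
To make the cost of this loop concrete, the following is a minimal sketch of how the number of runs grows with the number of failure points under stop-on-failure testing. It assumes that every failure ends the current run and that one final run completes the sequence; it also treats every run as a full 2-hour run, which is an upper bound because the text says the test resumes from the last failure point. The function names and the test point names are hypothetical.

```python
HOURS_PER_RUN = 2  # from the text: each run takes about 2 hours


def count_runs(failed_points):
    """Runs needed when the test stops at every failure point.

    Each failure ends the current run and sends the DUT to rework;
    one final run completes the sequence after the last rework.
    """
    return len(failed_points) + 1


def station_hours(failed_points, hours_per_run=HOURS_PER_RUN):
    """Upper-bound test-station hours, treating every run as a full run."""
    return count_runs(failed_points) * hours_per_run


# Hypothetical unit with two failure points (names are illustrative only).
unit = ["TrigLevel", "TrigDelay"]
print(count_runs(unit))     # 3
print(station_hours(unit))  # 6
```

Under these assumptions, a unit with two failure points needs three runs, which matches the "at least 3 rounds" case described above. Each failure point removed or reworked in the same visit saves one run and one trip to the rework station.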

1.4. Objectives of Project of Discover Test Failure pattern

Objectives of Discover Test Failure pattern

• Identify the failure relationships in the test using association rules and decision trees

1.5. Benefit of Project of Discover Test Failure pattern

Test and calibration time will be reduced when failures are detected early, which means the cycle time to produce a piece of new equipment will be reduced too. The annual test volume in 2018 was around 2000 runs. A lower cycle time directly reduces the production cost of a product and increases its net profit. Table 1.1 illustrates the ROI of the project.

Item	Value
Annual units	2222
Monthly units	186
Monthly operator pay	882
Test time per round (hours)	2
Total test time removed per month (in days)	48
Total test time removed per month (in months)	2
Savings per month (USD)	1949
Savings per year (USD)	23,384

Table 1.1 ROI of the Project of Discover Test Failure pattern

Assuming this approach is successfully deployed and removes 1 round of retest, it can save up to USD 23,384 per year. The project also captures and digitizes the pattern of test failure information into a model using a data-driven approach, moving from manual analysis to a machine learning model that gives test engineers insight into the failure relationships in a test process (Khalaf, 2016).

1.6. Research Questions of Discover Test Failure pattern

The research questions are:

1. What is the pattern of failure relationships in the test process?

2. What is the relationship between the test processes?

The research objectives are:

• To identify the pattern of failure relationships in the test

• To discover the failure pattern using machine learning

2.0 Introduction of Discover Test Failure pattern

This chapter reviews the available literature in the domain, covering the existing tools and analytical techniques.

2.1. Literature Review of Discover Test Failure pattern

Anticipating failures provides valuable insight for the optimal planning of an industrial manufacturing system (Khan, Schioler, Kulahci, & Peter, 2019). Productivity is one of three basic elements that manufacturers seek, along with cost and quality.

Manufacturers are trying to go beyond preventive maintenance to enable prescriptive maintenance systems. Reducing downtime is critical to the productivity and overall efficiency of industrial equipment and machinery. Predictive failure analysis aims to predict potential problems with a system or application. It extends availability by going beyond failure detection to predict a failure before it occurs (Aong & Lu, 2015).

Several journal articles on association rule mining were reviewed, because the technique can improve manufacturing productivity and it is important to know whether a sequence of failures can be detected during usage or from historical data (Kumar & Selvadoss, 2013). Unchalisa Taetragool proposed failure pattern analysis to solve problems in the domain of manufacturing quality improvement. A second study, by Apte, Weiss, and Grout (1993), employed five methods to predict defects in hard drive manufacturing (Chen, Zheng, Lloyd, Jordan, & Brewer, 2004).

2.2. Data Science & Analytics Technique

2.2.1. Decision Tree of Discover Test Failure pattern

A decision tree is a supervised learning algorithm that is widely used for classification problems. It is a decision support tool that uses a tree-like graph to model decisions and their consequences, including chance event outcomes, resource costs, and so on. Tree-based methods produce predictive models with high accuracy, stability, and ease of interpretation (Brid, 2019).

A decision tree has a natural “if.. then.. else” construction, which makes it fit easily into a programmatic structure. It is also ideal for categorization problems in which the attributes or features are systematically evaluated to determine a final category (Williams & Simoff, 2006).

There are two types of decision tree: the categorical variable decision tree and the continuous variable decision tree. The former has a categorical target variable, such as “FAIL” or “PASS”, while the latter has a continuous target variable. The figure below illustrates the problem of predicting whether a customer will pay an insurance company’s renewal premium (YES/NO).

The basic terminology associated with a decision tree is root node, splitting, decision node, leaf or terminal node, pruning, branch or sub-tree, and parent and child node. An advantage of the decision tree is that it does not require normalization of the data, while a disadvantage is that it can take longer to train the model (Rokach & Maimon, 2008).
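
As an illustration of the splitting step, the sketch below finds the single best split on one numeric feature for a categorical PASS/FAIL target. It uses Gini impurity as the split criterion, which is one common choice and an assumption here, since the text does not name a criterion. The feature name `offset_mv` and the data are hypothetical, and a full decision tree would apply this search recursively to each child node.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels, feature):
    """Find the threshold on one numeric feature that minimizes weighted Gini.

    Returns (threshold, impurity); threshold is None if no split improves
    on the impurity of the unsplit node.
    """
    best = (None, gini(labels))
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical measurements: low offsets pass, high offsets fail.
rows = [{"offset_mv": 1.0}, {"offset_mv": 2.0}, {"offset_mv": 8.0}, {"offset_mv": 9.0}]
labels = ["PASS", "PASS", "FAIL", "FAIL"]
print(best_split(rows, labels, "offset_mv"))  # (5.0, 0.0)
```

The resulting rule reads as "if offset_mv <= 5.0 then PASS else FAIL", which is the if-then-else structure described above. Because the split compares raw values against a threshold, rescaling the feature would not change which rows fall on each side, which is why normalization is not required.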

Several studies have described data classification applications in manufacturing. Wei-Choi C. proposed a data mining solution for discovering the root cause of a low-yield situation.

2.2.2. Association Rules of Discover Test Failure pattern

Association rule mining is a rule-based machine learning method for discovering interesting relationships among variables. An association rule is an if-then statement that helps to show the probability of relationships between data items. Association rule mining analyzes data for patterns or co-occurrences in a database and evaluates frequent if-then associations.

An association rule has two parts: an antecedent (if) and a consequent (then). The antecedent is an item found within the data, and the consequent is an item found in combination with the antecedent (Rouse, n.d.). Elisa found that data mining techniques such as association rules and decision trees are used to determine the causes of failures in the fabrication process (Criminisi, Shotton, & Konukoglu, 2012).

Furthermore, using association rule mining on infrequent patterns captured from industrial processes can provide useful knowledge to explain industrial failures (Martínez-de-Pisón, Sanz, Martínez-de-Pisón, Jiménez, & Conti, 2012).
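
As a rough sketch of how an if-then association can be scored, the code below computes support, confidence, and lift for a rule over a set of test runs, where each run is represented by the set of test points that failed in it. These three measures are the standard ones in association rule mining, but their use here is an assumption, since the text does not say which measures the project applied. The test point names (`TP1`, `TP2`, `TP3`) and the data are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    sup_a = support(transactions, antecedent)
    sup_c = support(transactions, consequent)
    if sup_a == 0 or sup_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    sup_both = support(transactions, set(antecedent) | set(consequent))
    confidence = sup_both / sup_a  # P(consequent | antecedent)
    lift = confidence / sup_c      # > 1 suggests a positive association
    return sup_both, confidence, lift


# Hypothetical runs: each set holds the test points that failed in that run.
runs = [
    {"TP1", "TP2"},
    {"TP1", "TP2", "TP3"},
    {"TP1"},
    {"TP3"},
]

sup, conf, lift = rule_metrics(runs, {"TP1"}, {"TP2"})
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

In this toy data, the rule "if TP1 fails then TP2 fails" holds in two of the three runs where TP1 fails, and its lift is above 1, so a TP1 failure makes a TP2 failure more likely than it is overall.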

2.3. Literature Review on Analytical Tools

2.3.1 R Studio of Discover Test Failure pattern

R is a programming language and open-source software environment for statistical computing and graphics, supported by the R Foundation. R is widely used by statisticians and data miners for data analysis. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland in the early 1990s (R Programming, 2019).

R Studio makes R programming easier to use. It includes a code editor, debugging features, and visualization tools. It supports file formats such as TXT, Excel, SPSS, SAS, and Stata. R Studio also integrates Git support, which makes it more convenient for users to access their workspace (Tan, Steinbach, & Kumar, 2016).

Figure 2 illustrates the R Studio screen, a four-panel workspace used to: 1. edit and create files containing R scripts; 2. enter R commands; 3. trace back the command history; 4. view plots or graphs (STHDA, 2019).

R is cross-platform and can be installed on Windows, macOS, and Linux. It has thousands of documented extensions, known as R packages.

Research Methodology of Discover Test Failure pattern

3.1 Introduction of Discover Test Failure pattern

In this project, failure pattern discovery is designed and developed to forecast which test point has the highest failure rate. The analysis gives the user insight into which tests are expected to fail. This chapter describes the selected methodology and the activities plan of the project.

3.2 Research Framework

The above figure demonstrates the workflow of the project. The project covers a production line for a specific product family and model. One year of historical data, from FY17 Q4 to FY18 Q3, was acquired.

3.3. Activities Plan & Project Gantt chart

This project consists of four phases, which are Introduction and Literature Review, Research Methodology, Results and Discussion, and Conclusion, as listed in Table 3.1. The Gantt chart is shown in Figure 3.2.

Phase	Task
Introduction & Literature Review	i. Background a. Domain/context ii. Problem Statement iii. Objective of project iv. Benefit of project v. Review on relevant literature on: a. Domain/context b. Data Science and Analytical techniques c. Data Science and Analytical tools
Research Methodology	i. Activities plan and Gantt chart ii. Data Science project lifecycle iii. Data acquisition and data exploratory analysis
Results and Discussion	i. Justification of selected DSA technique ii. Justification of selected DSA analytical tool iii. Challenges and Solutions iv. Discussion and Validation on project outcomes
Conclusion	i. Conclusion ii. Future work

Data Collection

Data Source of Discover Test Failure pattern

The data source used in this project consists of the following fields: Station, Product Family, Model, Option, Test Sequence Results, Test Point Name, and Completion Duration. One year of data, from FY17 to FY18, was collected for this project.

The data must be queried from the database to obtain the result files in XML format, which are then stored locally for data pre-processing. Figure 3.2 illustrates the data in the database. We do not have the raw data files on our local station, so we need to access the production database and query the data. We chose one year of data, from FY17 to FY18, to explore the relationships in the test sequence. After querying the data, we used a C# program to save the raw data files from the production database to local storage. The XML format contains all the information required for this project, such as Station ID, Result Outcome, Model, Option, Test Point Name, and Test Point Results. Each raw data file is around 2.2 megabytes; with 2222 rows of data in total, the total file size is about 5 gigabytes (Grąbczewski, 2013).

Figure 3.2 Raw data file, xml format.

The raw data files cannot be used directly for the data mining task, so we created another C# program to extract the data needed for modeling. The result is a table in which each test is represented as a row, as shown in Figure 3.3.
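
The extraction step can be sketched as follows. This is not the project's C# program; it is a minimal Python illustration of flattening one XML result file into table rows, one row per test point. The XML layout, element names, and attribute names (`TestRun`, `TestPoint`, `stationId`, and so on) are hypothetical, because the actual schema of the raw data files is not given in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical result file; the real schema is not described in the text.
SAMPLE_XML = """
<TestRun stationId="ST01" model="M1" option="001" outcome="FAIL">
  <TestPoint name="TP1" result="PASS"/>
  <TestPoint name="TP2" result="FAIL"/>
</TestRun>
"""


def flatten_run(xml_text):
    """Turn one XML result file into a list of row dictionaries.

    Run-level fields are repeated on every row so that each test point
    can be analyzed on its own.
    """
    root = ET.fromstring(xml_text.strip())
    run_fields = {
        "station_id": root.get("stationId"),
        "model": root.get("model"),
        "option": root.get("option"),
        "outcome": root.get("outcome"),
    }
    return [
        {**run_fields, "test_point": tp.get("name"), "result": tp.get("result")}
        for tp in root.iter("TestPoint")
    ]


for row in flatten_run(SAMPLE_XML):
    print(row)
```

Rows in this shape can be written to a CSV file and loaded into an analysis tool. Grouping the failed test points by run also gives the per-run sets needed for association rule mining.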

5.1. Conclusion of Discover Test Failure pattern

In a nutshell, this project was delivered successfully and met its objective and business requirement. It addressed the difficulty that engineers face whenever a yield problem occurs on the production line. The solutions and benefits discussed here are intended to help engineers avoid such problems in the future and to keep their time from being wasted.

5.2 Lesson Learnt of Discover Test Failure pattern

In this practicum, I learned to focus on the data and the hidden patterns in the project, and that the business problem needs to be stated clearly. The main benefit of the project is that test and calibration time is reduced when failures are detected early, which means the cycle time to produce a piece of new equipment is reduced too. The annual test volume in 2018 was around 2000 runs. A lower cycle time can directly reduce the production cost of a product and increase its net profit.

5.3 Future Enhancement of Discover Test Failure pattern

Every study has implications for future work, and this project is no exception. It succeeded in producing findings that engineers can apply in their future work without disruption.

The model can also serve as the basis for further machine learning algorithms. Further exploration of potential features from other data sources might be carried out, for example to predict future occurrences of test limit failures.

References of Discover Test Failure pattern

Aong, Y.-y., & Lu, Y. (2015). Decision tree methods: applications for classification and prediction. Shanghai Arch Psychiatry., 27(02), 130-135.

Chen, M., Zheng, A. X., Lloyd, J., Jordan, M. I., & Brewer, E. (2004). Failure Diagnosis Using Decision Trees. Failure Diagnosis Using Decision Trees, 03(02), 01-10.

Criminisi, A., Shotton, J., & Konukoglu, E. (2012). Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning. Now Publishers.

Grąbczewski, K. (2013). Meta-Learning in Decision Tree Induction. Springer.

Hsu, C.-K., Sarson, P., Schatzberger, G., & Leisenberger, F. (2016). Variation and failure characterization through pattern classification of test data from multiple test stages. 2016 IEEE international test, 01(10), 01-10.

Khalaf, A. (2016). Applying Association Rules and Decision Tree Algorithms with Tumor Diagnosis Data. International Research Journal of Engineering and Technology (IRJET), 03(08), 27-30.

Rokach, L., & Maimon, O. Z. (2008). Data Mining with Decision Trees: Theory and Applications. World Scientific.

Tan, P.-N., Steinbach, M., & Kumar, V. (2016). Introduction to Data Mining. Pearson Education India.

U.S. Government Printing Office. (1989). Naval Research Reviews, Volume 41, Issue 3. U.S. Government Printing Office.

Williams, G. J., & Simoff, S. J. (2006). Data Mining: Theory, Methodology, Techniques, and Applications. Springer.