DEVELOPMENT OF A DECISION TREE BASED PREDICTIVE MODEL FOR CONDITION BASED FAULT DETECTION IN AUTOMOBILE ENGINES


BY


UMURIE STEPHEN NSIKAK

20CM027769


A FINAL-YEAR PROJECT SUBMITTED TO THE DEPARTMENT

OF MECHANICAL ENGINEERING, COLLEGE OF ENGINEERING,

COVENANT UNIVERSITY, OTA, OGUN STATE, NIGERIA


IN PARTIAL FULFILMENT

OF THE REQUIREMENTS FOR THE AWARD OF A BACHELOR OF

ENGINEERING DEGREE B.Eng. (Hons.) IN

MECHANICAL ENGINEERING


JUNE 2025




DECLARATION

I, Umurie Stephen Nsikak, hereby declare that the dissertation titled "Development of a Decision Tree Based Predictive Model for Condition Based Fault Detection in Automobile Engines" is an original work submitted for the degree of Bachelor of Engineering (B.Eng.) in the Department of Mechanical Engineering at Covenant University under the supervision of Dr. Ekpe C. Ikenna. I also confirm that all information obtained from the works of other authors has been properly acknowledged. Furthermore, I assert that this dissertation has not been presented by another individual in whole or in part at this or any other institution for a degree or diploma.




UMURIE STEPHEN NSIKAK                           ….…………………………………………… 20CM027769                                                                              Signature and Date  



CERTIFICATION

This is to certify that the final-year project report titled "Development of a Decision Tree Based Predictive Model for Condition Based Fault Detection in Automobile Engines", prepared by Umurie Stephen Nsikak (20CM027769), has been reviewed and approved for submission in partial fulfilment of the requirements for the award of the Bachelor of Engineering (B.Eng.) degree in Mechanical Engineering at Covenant University, Ota, Ogun State, Nigeria.





Dr. Ekpe C. Ikenna                                                 ….…………………………………………… 

 Project Supervisor                                                                      Signature and Date







Professor Olayinka S. Ohunakin                           ….……………………………………………
  HOD Mechanical Engineering                                                
Signature and Date











DEDICATION

I want to dedicate this project to Almighty God in Heaven, my father Mr. Stephen U. Umurie, my mother Mrs. Theresa U. Umurie and my younger sister, Umurie O. Genevieve.

This research is dedicated to all those who have contributed to my academic and professional growth, and I am proud to carry this work forward as a testament to their support.



















ACKNOWLEDGEMENTS

I would like to take this opportunity to acknowledge everyone who has played a role in this journey. My deepest gratitude goes to my late grandmother, Paulina Umoh, who continues to intercede for me from heaven. I am eternally grateful for the unwavering support of my parents, Mr. Stephen Ufuoma and Mrs. Theresa Udeme Umurie, and my sister, Genevieve Umurie.

Special thanks to my uncle, Rev. Fr. Michael Umoh, my aunt and uncle, Anthony Umoh and Folashade Umoh, as well as their daughter Paula Umoh, for their encouragement. I also extend my warm regards to Rebecca Onwordi “Aunty Becky,” whose efforts were instrumental in helping me secure my placement at Total Energies.

I would like to acknowledge my manager, Mr. Francis Obiajulu, under whose mentorship I gained invaluable experience, and Mr. Francis Okonji. My thanks also go to my supervisor, Dr. Ikenna Ekpe, for his guidance and support throughout my research. I am also grateful to Engr. John Morounfoluwa, Engr. Enoch Obanor, Dr. Banjo Solomon and Dr. Mfon Udo for their advice and encouragement in shaping my research. Finally, I thank Murewa Newo for her support and encouragement throughout this research.











TABLE OF CONTENTS


DECLARATION

CERTIFICATION

DEDICATION

ACKNOWLEDGEMENTS

LIST OF FIGURES

LIST OF TABLES

LIST OF ABBREVIATIONS

ABSTRACT

CHAPTER ONE

1.1. Background to the Study

1.2. Introduction to Condition-Based Maintenance in Automobile Engineering

1.3. Fault Detection and Diagnosis

1.4. Problem Statement

1.5. Aim of the Study

1.6. Objectives of the Study

1.7. Significance of the Study

1.8. Scope of the Study

1.9. Limitations of the Study

CHAPTER TWO

2.1. Preamble

2.1.1. Working Principle of Internal Combustion Engines

2.1.2. Comparison Between 4-stroke Engines and 2-Stroke Engines

2.1.3. Classification of Engines

2.1.4. Components of Engine

2.1.5. Automotive Electrical Systems

2.1.6. Automotive Fuel System

2.1.7. Engine Performance

2.2. Historical Context and Evolution of Condition Based Maintenance

2.3. Modelling and Simulation in Automobile Maintenance

2.3.1. Predictive Models

2.3.2. Machine Learning Models

2.4. Studies and Development

2.5. Condition Based Maintenance Techniques

2.6. Overview of Diagnostic Systems in Modern Vehicles and Predictive Maintenance

2.7. Sensor Technology

CHAPTER THREE

3.1. Preamble

3.2. Analysis

3.3. Data Sourcing

3.4. Data Cleaning

3.5. Data Modelling, Simulation, and Testing

3.5.1. MATLAB

3.5.2. Comparative Analysis with Other Machine Learning Models

3.5.3. Classification Decision Tree

3.5.4. Model Development

3.5.5. Model Evaluation

3.6. Development and Design of Predictive Model

CHAPTER FOUR

4.1. Analysis of Metric and Scores for Decision Tree

4.2. Decision Tree Performance

4.2.1. Analysis of Decision Tree and Breakdown

4.3. Analysis of the Predictive Model

4.3.1. Performance of the Predictive Model

CHAPTER FIVE

5.1. Conclusion

5.2. Recommendation

REFERENCES






LIST OF FIGURES

Figure 1.1 Automobile Engine

Figure 1.2 Otto Engine

Figure 1.3 J.J. Lenoir Engine

Figure 1.4 Electric Car Drivetrain

Figure 2.1 Working Principle of an Internal Combustion Engine

Figure 2.2 Intake Stroke

Figure 2.3 Compression Stroke

Figure 2.4 Power Stroke

Figure 2.5 Exhaust Stroke

Figure 2.6 4-stroke Engine

Figure 2.7 2-Stroke Engine

Figure 2.8 Different Engine Arrangements

Figure 2.9 Spark Ignition Engine

Figure 2.10 Compression Engine

Figure 2.11 Cylinder Block

Figure 2.12 Cylinder Head

Figure 2.13 Crankshaft

Figure 2.14 Camshaft

Figure 2.15 Valve Train

Figure 2.16 Piston Rings

Figure 2.17 Connecting Rods

Figure 2.18 Carburetor

Figure 2.19 The PdM flow

Figure 2.20 A typical Decision Tree

Figure 2.21 OBD codes interpretation

Figure 2.22 Automobile Sensors

Figure 3.1 Data Cleaning Operation

Figure 3.2 MATLAB Logo

Figure 3.3 Classes of Decision Tree

Figure 3.4 Training Results

Figure 4.1 Display of Trained Decision Tree

Figure 4.2 Code for Predictive Model

Figure 4.3 Output of Predictive Model
















LIST OF TABLES


Table 2.1 Comparison between Predictive Models

Table 3.1 Comparative Analysis of Machine Learning Model Results

Table 4.1 Average Class Metrics

Table 4.2 Per Class Metrics



















LIST OF ABBREVIATIONS














ABSTRACT

Automobile engines are critical to vehicle operation, and timely fault detection is essential for maintaining performance and reducing downtime. This study presents the development of a decision tree-based predictive model for condition-based fault detection, aimed at improving engine diagnostics and maintenance efficiency. The research centres on classifying engine condition as either GOOD or BAD through analysis of six key parameters: Engine RPM, Fuel Pressure, Lubricant Oil Pressure, Lubricant Oil Temperature, Coolant Temperature, and Coolant Pressure. The model was developed using a supervised learning approach in MATLAB, involving data preprocessing, training of classification decision trees, and cross-validation to ensure model robustness. The dataset comprised sensor readings from 19,534 automobile engines, sourced from secondary data, and served as the foundation for building the predictive framework. Threshold-based decision rules extracted from the trained decision tree were used to classify engine conditions and identify the deviations responsible for BAD states. A root cause analysis feature was also integrated to detect specific parameter faults, enabling precise and targeted maintenance actions. The model demonstrated moderate classification accuracy, indicating its potential for real-time application in engine condition monitoring. This research concludes that combining predictive maintenance techniques with interpretable decision tree models offers a practical solution for improving engine reliability, reducing reactive maintenance, and minimizing operational costs in the automotive industry.


Keywords: Predictive Maintenance, Decision Trees, Condition-Based Maintenance, Engines, Sensors, Fault Detection.







CHAPTER ONE

                                                  INTRODUCTION

Automobile engines, such as the one shown in Figure 1.1, power motor vehicles by converting energy from fuel or electricity into mechanical power delivered at a rotating output shaft. The term "engine" originates from the Latin word ingenium, meaning ingenuity, reflecting the innovation inherent in its development (Britannica, n.d.).



Figure 1.1 Automobile Engine 

   luxizeng. (n.d.). Automobile engine [Photograph]. iStock. https://www.istockphoto.com

The history of automotive engines dates to the 19th century, when the automobile industry developed alongside the internal combustion engine (ICE), pioneered by innovators such as Nikolaus Otto, who developed the four-stroke engine in 1876 (Figure 1.2) (Wikipedia contributors, n.d.). This advancement laid the foundation for modern gasoline and diesel engines. Early automotive engines were bulky and inefficient, but technological advancements, particularly in the 20th century, improved their design and functionality. Historical records indicate that primitive internal combustion engines and self-propelled road vehicles can be traced back to the 1600s (Wikipedia contributors, n.d.). Many of these early designs were steam-powered prototypes that never evolved into practical vehicles, because the necessary advances in technology, infrastructure, materials, and fuels were lacking at the time. Early heat engines, of both internal and external combustion types, experimented with gunpowder and various solid, liquid, and gaseous fuels.

Henry Ford’s introduction of the moving assembly line in 1913 significantly improved engine production processes, making automobiles more affordable and accessible. In the 1950s and 1960s, General Motors incorporated automation into engine manufacturing with industrial robots, enhancing precision and efficiency (International Federation of Robotics, n.d.).

During the latter half of the 19th century, numerous designs of internal combustion engines were developed and tested. These early engines showed different levels of success and reliability, employing several mechanical systems and engine cycles.


Figure 1.2 Otto Engine

Henry Ford. (n.d.). Internal combustion engine [Photograph]. From the Collections of Henry Ford. https://www.thehenryford.org

Around 1860, J.J.E. Lenoir (1822–1900) developed the first fairly practical internal combustion engine, shown in Figure 1.3. Over the next ten years, several hundred Lenoir engines were manufactured, producing up to 4.5 kW (6 hp) of power with a mechanical efficiency of roughly 5% (Heywood, 1988). Introduced in 1867, the Otto-Langen engine showed an enhanced efficiency of about 11%. Thousands of these atmospheric engines, whose power stroke used atmospheric pressure acting against a vacuum, were manufactured in the following decade. Among the famous inventors of this time were Nicolaus A. Otto (1832–1891) and Eugen Langen (1833–1895) (Setright, 2003).


Figure 1.3 J.J. Lenoir Engine

 Google Images. (n.d.). [Etienne Lenoir]. Retrieved January 23, 2025, from https://images.app.goo.gl/FVMy175HzVGkQDGq8

During this time, engines based on the four-stroke cycle—the foundation of modern automobile engines—began to emerge as the optimal design. While many engineers contributed to its development, Otto received recognition for his prototype, completed in 1876.

The 1880s marked the debut of internal combustion engines in automobiles. This decade also saw the practical implementation and mass production of two-stroke cycle engines.

By 1892, Rudolf Diesel (1858–1913) had refined his compression ignition engine to closely resemble the modern diesel engine. This advancement followed years of experimentation, including trials with solid fuels in early prototypes. While early compression ignition engines were large, noisy, slow, and single cylinder, they were generally more efficient than spark ignition engines. It was not until the 1920s that multicylinder diesel engines became compact enough for use in automobiles and trucks.

By the 1980s, the Japanese automotive industry led advancements in engine design, focusing on fuel efficiency and reliability (Setright, 2003). This trend continued into the 21st century, with China emerging as the largest producer of automotive engines by 2023, reflecting the global shift in manufacturing power (Statista, 2023).

The diversity of automotive engines has expanded dramatically to cater to various vehicle types and consumer demands. Traditional ICEs, powered by gasoline or diesel, have been complemented by hybrid engines that combine fuel-based and electric power, and fully electric engines that rely on batteries. Innovations like hydrogen fuel cells and advanced turbocharging technologies are also reshaping the landscape.

Environmental concerns and stringent regulations have driven the development of cleaner and more efficient engine technologies (EPA, 2021). The rise of electric vehicles (EVs) has challenged traditional ICE dominance, with major automotive firms investing heavily in EV engine development (IEA, 2023). Figure 1.4 shows the drivetrain of an electric car.


Figure 1.4 Electric Car Drivetrain

Google Images. (n.d.). [Electric Car Drivetrain]. Retrieved January 23, 2025, from https://images.app.goo.gl/rKS27uC6tUhtyrmx5


In emerging markets like Nigeria, local manufacturers such as Innoson Vehicle Manufacturing (IVM) are assembling engines locally and contributing to economic development (NBS, 2023). These efforts underscore the potential for growth in domestic automotive sectors.

Automotive engines continue to be a critical area of innovation, balancing power, efficiency, and environmental sustainability. The transition towards electrification and alternative fuels marks a significant shift, shaping the future of the automotive industry globally.

Engines can be categorised by ignition type, engine cycle, valve placement, design, and cylinder configuration (Heywood, 1988). The ignition types are spark ignition (using a spark plug) and compression ignition (self-ignition from high compression). Engine cycles are classified as two-stroke (one revolution per cycle) or four-stroke (two revolutions per cycle). Valves may be located in the head (I-head), in the block (L-head), or in both (F-head). Basic designs are reciprocating engines, with pistons moving back and forth, and rotary engines, with a rotor turning inside a stator. Cylinder arrangements range from single-cylinder and in-line layouts to V-shaped, opposed (flat), W, opposed-piston, and radial configurations, which are found in cars, aircraft, and industrial applications (Heywood, 1988).

In the automotive context, predictive maintenance is the practice of monitoring vehicle condition using real-time data from sensors (Lee et al., 2020). It applies across vehicle systems and components (Mobley, 2002) and has a wide range of applications. It is important because it finds problems before they develop into failures, including faults not easily diagnosed by conventional maintenance techniques (Jardine et al., 2006). Unlike preventive maintenance, which depends on scheduled service, predictive maintenance allows faults to be found ahead of time and so prevents breakdowns (Carvalho et al., 2019). Key technologies include vibration monitoring, which detects anomalies in rotating parts such as engines and gearboxes (Peng et al., 2010); thermography, used to monitor temperature in electrical systems and engines for overheating or friction issues (Bagavathiappan et al., 2013); and oil analysis (tribology), which examines lubricants to identify wear and contamination in engines and transmissions (Hunt, 1993).

Ensuring the best performance of cars and identifying their problems have been vital for as long as cars have been built. Historically, diagnosing vehicle problems depended on human senses: looking for signs of wear or damage, listening for unusual engine or mechanical sounds, feeling for vibrations or heat from malfunctioning components, and even detecting odours from leaks or overheating systems (Mobley, 2002). Although these techniques were helpful, their accuracy was poor, and they required considerable human presence and expertise.

Sensors built into cars improved defect detection by providing precise, real-time data on various vehicle attributes (Peng et al., 2010). Sensors track important variables including engine performance, fluid levels, temperature, and pressure, thereby providing more accurate assessments. These devices are themselves prone to faults, though, and they do not always record issues accurately (Carvalho et al., 2019). This issue became especially important as modern cars adopted advanced automated systems in which erroneous sensor data might lead to serious failures or unforeseen effects (Lee et al., 2020). Onboard computers have advanced defect detection even further. By combining data from several sensors to identify defects more completely, these systems provide real-time monitoring and diagnosis (Isermann, 2006). By pinpointing specific issues, including failing sensors, advanced onboard diagnostic systems help to lower the likelihood of misdiagnosis. Moreover, the computational speed of these systems allows them to spot and address developing problems before they become critical, enhancing vehicle reliability and safety (Hunt, 1993).

Fault Detection and Diagnosis (FDD) is a systematic process for detecting and isolating operational faults in a system. FDD can be achieved through a data-driven approach, using data collection techniques, or through a knowledge-based approach that draws on existing human expertise, which is how FDD has mostly been practised. While Predictive Maintenance (PdM) has greatly enhanced equipment dependability and operational efficiency in vehicle systems, diagnostic systems have helped to detect faults (Bagavathiappan et al., 2013).

Modern diagnostic systems consist of several parts that are crucial to the maintenance of a car. The Electronic Control Unit (ECU) is the main computer and controls functions including fuel injection, ignition timing, and gearbox control (Williams & Fletcher, 2018). It processes data from many sensors and, upon detecting a fault, stores the relevant Diagnostic Trouble Code (DTC) and switches on the Malfunction Indicator Lamp (MIL). Real-time data from sensors and actuators around the vehicle feeds the ECU, which uses this information to optimise vehicle performance. The Data Link Connector (DLC) provides an interface for connecting diagnostic tools to the vehicle's ECU, allowing DTCs to be retrieved, live data to be accessed, and diagnostic tests to be run (Peng et al., 2010).

DTCs are alphanumeric codes stored in the ECU when a fault is found. They indicate problems such as malfunctioning sensors or emissions-related concerns and so offer a basis for diagnosis (Mobley, 2002). The MIL, often referred to as the "check engine" light, warns the driver of a fault that might compromise vehicle performance or emissions, and it stays on until the problem is fixed and the DTC is cleared. On-board monitoring continuously evaluates the performance of components such as oxygen sensors and catalytic converters; if a system runs outside its intended range, a DTC is stored and the MIL is illuminated (Isermann, 2006).
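The interaction between on-board monitoring, DTC storage and the MIL described above can be sketched in a few lines of Python. This is only an illustrative toy model: the class name `SimpleEcu`, the code `P_DEMO_COOLANT` and the 110 °C limit are assumptions made for the example and are not taken from any real ECU or from the dataset used in this study.

```python
class SimpleEcu:
    """Toy model of an ECU that stores DTCs and drives the MIL."""

    def __init__(self, coolant_limit_c=110.0):
        # Hypothetical limit; real limits are manufacturer-specific.
        self.coolant_limit_c = coolant_limit_c
        self.stored_dtcs = set()

    @property
    def mil_on(self):
        # The MIL stays lit for as long as any DTC is stored.
        return bool(self.stored_dtcs)

    def monitor_coolant(self, temperature_c):
        # Store a DTC when the reading leaves its intended range.
        if temperature_c > self.coolant_limit_c:
            self.stored_dtcs.add("P_DEMO_COOLANT")  # hypothetical code

    def clear_dtcs(self):
        # Clearing the codes (after a repair) switches the MIL off.
        self.stored_dtcs.clear()


ecu = SimpleEcu()
ecu.monitor_coolant(95.0)
mil_after_normal = ecu.mil_on       # no fault seen yet
ecu.monitor_coolant(118.0)
mil_after_fault = ecu.mil_on        # fault detected, DTC stored
ecu.monitor_coolant(95.0)
mil_after_recovery = ecu.mil_on     # DTC is still stored
ecu.clear_dtcs()
mil_after_clear = ecu.mil_on        # codes cleared
```

The point of the sketch is that the lamp reflects the stored codes rather than the latest reading, which is why the MIL remains on after the temperature returns to normal and goes off only when the codes are cleared.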

In vehicles, fault detection and diagnosis (FDD) is the identification and analysis of defects, including leaks, wear, sensor errors, or actuator failures, that deviate from normal system behaviour (Jardine et al., 2006). The procedure comprises fault detection, isolation of the faulty component, and occasionally quantification of the fault's magnitude (Williams & Fletcher, 2018). FDD methods are either model-based, relying on predictive models and residual analysis (Venkatasubramanian et al., 2003), or model-free, using techniques such as sensor redundancy, limit checking, or spectrum analysis. Reliable fault detection with minimal false alarms depends on robust algorithms that mitigate noise, disturbances, and modelling errors (Lee et al., 2020). Current FDD systems improve vehicle dependability, safety, and efficiency.
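The difference between a model-free limit check and a model-based residual check can be illustrated with a minimal Python sketch. The function names, the oil-pressure values and the thresholds below are hypothetical and serve only to show the two ideas side by side.

```python
def limit_check(value, low, high):
    """Model-free check: is the reading outside its allowed band?"""
    return value < low or value > high


def residual_check(measured, predicted, threshold):
    """Model-based check: does the residual exceed the threshold?"""
    residual = measured - predicted
    return abs(residual) > threshold


# Hypothetical lubricant oil pressure readings in bar.
limit_fault = limit_check(value=1.2, low=2.0, high=6.0)
residual_fault = residual_check(measured=3.1, predicted=4.0, threshold=0.5)
no_fault = residual_check(measured=3.9, predicted=4.0, threshold=0.5)
```

In the second call the reading of 3.1 bar lies inside the fixed band used by the limit check, yet it is flagged because it differs from the model's prediction by more than the threshold. This is the practical advantage of residual analysis, and the threshold is what trades sensitivity against false alarms.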

Effective FDD in machines is built on these systems. To improve vehicle condition, FDD combines automotive diagnostics with preventative maintenance methods. By quickly identifying problems from current data, it maximises the operational efficiency of motor vehicles. Most importantly, it reduces the trial and error that accompanies conventional approaches and helps to lower maintenance expenses (Carvalho et al., 2019).

Conventional approaches to fault identification are error-prone. Most of the time they rely on accumulated prior knowledge, which can be outdated and prone to mistakes (Venkatasubramanian et al., 2003). This knowledge-based evaluation often conflicts with the maintenance needs of a real-world setting (Mobley, 2002). Automotive maintenance therefore needs to be dynamic so that it can keep pace with changes in the engineering sector, and predictive maintenance systems support this flexibility (Jardine et al., 2006).

Cimino et al. (2019) offer a remedy by developing a dynamic model that resolves discrepancies between virtual and real-world interactions. PdM systems depend on sensor data, real-time monitoring, and historical maintenance records; hence, inconsistent or missing data can compromise the efficacy of the predictive model (Carvalho et al., 2019). Although data collection devices for vehicles vary by application, these sensors can malfunction or record crucial parameters incorrectly (Lee et al., 2020).
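One simple way to guard a predictive model against missing sensor data is to separate complete records from incomplete ones before training. The sketch below assumes records stored as Python dictionaries; the field names and values are hypothetical and are not drawn from the study's dataset.

```python
def split_complete_rows(rows, required):
    """Separate complete records from those with missing readings."""
    complete, incomplete = [], []
    for row in rows:
        # A reading counts as missing if the key is absent or its value is None.
        if all(row.get(name) is not None for name in required):
            complete.append(row)
        else:
            incomplete.append(row)
    return complete, incomplete


REQUIRED = ("engine_rpm", "fuel_pressure", "coolant_temp")  # hypothetical names
readings = [
    {"engine_rpm": 700, "fuel_pressure": 6.2, "coolant_temp": 81.0},
    {"engine_rpm": 880, "fuel_pressure": None, "coolant_temp": 79.5},
    {"engine_rpm": 1200, "coolant_temp": 84.0},
]
usable, rejected = split_complete_rows(readings, REQUIRED)
```

Keeping the rejected records, rather than silently discarding them, makes it possible to report how much data was lost and to check whether a particular sensor is responsible for most of the gaps.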

The aim of the study is to develop a predictive algorithm, using a decision tree, that analyses specific engine parameters to detect the root causes of faults in automobile engines.

The specific objectives of the study are to:

This study focuses specifically on engine faults and investigates the relationship between several engine parameters and engine condition. The goal is to create a Decision Tree predictive model able to identify the source of engine faults. The model will be built from real-time data acquired from approximately 19,000 vehicles whose engine condition has been assessed and classified as either "good" or "bad" depending on performance. The research will identify correlations between bad engine conditions and particular engine parameters: engine RPM, lubricant oil pressure, fuel pressure, coolant pressure, lubricant oil temperature, and coolant temperature. Knowing how deviations in these parameters relate to engine defects allows the Decision Tree model to provide a methodical way of finding the root cause of failures. The resulting predictive model will identify potential engine faults and offer quick detection so that repairs may be carried out.

The aim of the research is to build a decision tree based predictive model that uses engine parameters (lubricant pressure, lubricant temperature, coolant pressure, coolant temperature, fuel pressure, and engine RPM) to detect engine faults by identifying the root cause of bad engine conditions.
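The idea of classifying an engine and reporting the root cause can be sketched with simple threshold rules of the kind a trained decision tree produces. The sketch below is not the model developed in this study: the parameter names, units and acceptable ranges are hypothetical placeholders chosen only to show the logic.

```python
# Hypothetical acceptable ranges (low, high); NOT thresholds from this study.
ACCEPTABLE = {
    "engine_rpm": (600, 2000),
    "lub_oil_pressure": (2.0, 6.0),
    "fuel_pressure": (3.0, 12.0),
    "coolant_pressure": (1.0, 4.0),
    "lub_oil_temp": (70.0, 90.0),
    "coolant_temp": (70.0, 95.0),
}


def diagnose(reading):
    """Return ("GOOD", []) or ("BAD", [parameters outside their range])."""
    causes = [
        name
        for name, (low, high) in ACCEPTABLE.items()
        if not low <= reading[name] <= high
    ]
    return ("BAD", causes) if causes else ("GOOD", [])


healthy = {
    "engine_rpm": 900, "lub_oil_pressure": 3.5, "fuel_pressure": 6.0,
    "coolant_pressure": 2.2, "lub_oil_temp": 78.0, "coolant_temp": 82.0,
}
# Same engine with low oil pressure and an overheated coolant circuit.
faulty = dict(healthy, lub_oil_pressure=1.1, coolant_temp=104.0)

healthy_result = diagnose(healthy)
faulty_result = diagnose(faulty)
```

A trained decision tree would learn its thresholds from labelled data and apply them along a path of conditions instead of checking every range independently, but the output has the same shape: a GOOD or BAD label together with the parameters that explain a BAD result.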

The automotive system is large. Thousands of sensors work in different ways, covering temperature control, speed, geotagging, heat control, and fuel and speed analysis. Condition Based Maintenance covers broad ground, and every issue needs to be resolved differently. This work addresses engine-based conditions and their effects on the engine. Sensor data relating to the camshaft, voltage, mileage, tyre pressure, temperature, or other vehicle sensors is therefore not considered, since including it would depart from the intended outcome.

With the dataset at hand, this study can investigate only specific engine parameters and how they affect the condition of automotive engines, offering insight into how best to handle the problem and propose solutions.













CHAPTER TWO

 LITERATURE REVIEW

2.1. Preamble

Engines are the powerhouses of machines. They are present in mechanical devices of every size, from the smallest to the largest, including wristwatches, fans, grinders, lathe machines, milling machines, ships, motor cars and aircraft; in short, in anything that requires mechanical energy to work. These engines share the same broad working principle but vary in components or design depending on the machine they drive. Internal combustion engines (ICEs) generate power through the combustion of fuel within the engine itself, converting chemical energy into mechanical work (Heywood, 1988). ICEs are commonly classified into two main types: spark-ignition (SI) engines and compression-ignition (CI) engines (Stone, 1999).

This research focuses on Internal Combustion Engines (ICEs) in automobiles. The automobile engine is an internal combustion engine because the fuel is burned in the presence of air inside the engine. This thermodynamic process involves temperature and pressure changes within the engine to facilitate combustion (Pulkrabek, 2014). Many ICEs are reciprocating engines because their pistons move back and forth. External Combustion Engines (ECEs) also exist, but they are mostly found in turbine applications (Cengel & Boles, 2019). ICEs are common in automobiles, jets, and rockets. Figure 2.1 shows the working principle of an Internal Combustion Engine.



Figure 2.1 Working Principle of an Internal Combustion Engine (Retrieved from https://images.app.goo.gl/fqbEeqLdxpqLmcdG9)




 2.1.1. Working Principle of Internal Combustion Engines

Internal combustion engines used in automobiles fall into two general classes: two-stroke engines and four-stroke engines (Giakoumis, 2016). Automobile engines run on petrol, diesel, or liquefied petroleum gas (LPG). Efficiency and environmental concerns are now driving the replacement of petrol vehicles with vehicles fitted with Compressed Natural Gas (CNG) systems (Pulkrabek, 2014).

These engines have pistons that reciprocate within their cylinders. A stroke is the full travel of the piston in the cylinder in one direction. Two-stroke engines complete one cycle in two strokes, while four-stroke engines complete one cycle in four strokes (Cengel & Boles, 2011). The Vespa PX150 operates on a two-stroke engine, whereas four-stroke engines power vehicles ranging from the Toyota Corolla to the Ford F-150 and the Honda Civic.

While the two-stroke cycle is common in smaller vehicles and equipment such as motorcycles, scooters, and lawn equipment (Heywood, 1988), most automotive engines run on the four-stroke cycle. The four-stroke cycle consists of four operations: the intake, compression, power, and exhaust strokes (Cengel & Boles, 2011). Internal combustion engines work by burning fuel to release heat energy, which is subsequently converted into mechanical energy (Pulkrabek, 2004).


2.1.1.1. Intake Stroke

The intake stroke, shown in Figure 2.2, is also known as the suction stroke. During this stroke, the air-fuel mixture enters the cylinder through the inlet valve, so the inlet valve is open and the exhaust valve is closed. The piston moves towards Bottom Dead Centre (BDC), creating the partial vacuum that draws the mixture in.


Figure 2.2 Intake Stroke (shutterstock, 2025)

Shutterstock. (n.d.). Engine four-stroke cycle infographic diagram [Illustration]. Retrieved January 4, 2025, from https://www.shutterstock.com/image-illustration/engine-four-stroke-cycle-infographic-diagram-707664301 

 

2.1.1.2. Compression Stroke

The compression stroke, shown in Figure 2.3, is the second process in the cycle. During this process, both the inlet and exhaust valves are closed as the piston moves from Bottom Dead Centre (BDC) to Top Dead Centre (TDC), compressing the air-fuel mixture that was previously drawn into the cylinder. This compression increases both the pressure and temperature of the mixture, facilitating its vaporization. As the piston approaches TDC, a spark is generated by the spark plug, igniting the air-fuel mixture.


Figure 2.3 Compression Stroke 

Gooseko. (n.d.). Combustion engine. Retrieved from https://images.app.goo.gl/ZFvMo1Dd9v5xftX96


 2.1.1.3. Power Stroke

In the power stroke (Figure 2.4), the inlet and exhaust valves remain closed, allowing the burning gases to expand and force the piston downward towards BDC. This downward movement of the piston generates rotary motion in the crankshaft. Once the piston nears BDC, the exhaust valve opens to allow the spent gases to exit the cylinder.


Figure 2.4 Power Stroke


ScienceDirect Topics. (n.d.). Power stroke - an overview. Retrieved from https://images.app.goo.gl/BSqbQupyRjstuJBo8


2.1.1.4. Exhaust Stroke

During the exhaust stroke, shown in Figure 2.5, the exhaust valve is open to allow the by-products of combustion to leave the cylinder. The piston moves back up towards TDC, pushing the spent gases out.
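The four strokes described above can be summarised as a small lookup table in Python. This is a simplified sketch: the names `Stroke` and `stroke_at` are introduced here for illustration, and the table records only the nominal valve positions, ignoring the early opening of the exhaust valve near the end of the power stroke.

```python
from collections import namedtuple

# One row per stroke: where the piston is heading and which valves are open.
Stroke = namedtuple("Stroke", "name piston_moves_to inlet_open exhaust_open")

FOUR_STROKE_CYCLE = (
    Stroke("intake", "BDC", True, False),
    Stroke("compression", "TDC", False, False),
    Stroke("power", "BDC", False, False),
    Stroke("exhaust", "TDC", False, True),
)


def stroke_at(stroke_count):
    """Return the stroke in progress after `stroke_count` completed strokes."""
    return FOUR_STROKE_CYCLE[stroke_count % len(FOUR_STROKE_CYCLE)]


# Each stroke is half a crankshaft revolution, so one cycle takes two.
revolutions_per_cycle = len(FOUR_STROKE_CYCLE) / 2
```

For example, `stroke_at(5)` returns the compression stroke, because the cycle repeats after every four strokes.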

This combustion process is typical of automobile engines that run on petrol. The process in a four-stroke diesel engine is similar. However, diesel and gasoline engines differ primarily in their fuel injection methods, ignition processes, compression ratios, combustion characteristics, and emissions. Diesel engines inject fuel directly into the combustion chamber at high pressure and rely on compression ignition, resulting in higher compression ratios (14:1 to 25:1) and greater thermal efficiency. In contrast, gasoline engines typically mix fuel with air before ignition via a spark plug, operating at lower compression ratios (8:1 to 12:1). This leads to quicker combustion but lower thermal efficiency. Additionally, diesel engines produce more nitrogen oxides and particulates, while gasoline engines emit more carbon monoxide and hydrocarbons.
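The link between compression ratio and thermal efficiency can be illustrated with the ideal air-standard Otto cycle, whose efficiency is 1 - r^(1 - γ) for compression ratio r. The sketch below assumes γ = 1.4 (air) and applies the formula to the gasoline range quoted above. It is a textbook idealisation: real engines are considerably less efficient, and a diesel engine is better described by the Diesel cycle, which is not modelled here.

```python
def otto_efficiency(compression_ratio, gamma=1.4):
    """Ideal air-standard Otto-cycle thermal efficiency."""
    # gamma is the ratio of specific heats; 1.4 is the usual value for air.
    return 1.0 - compression_ratio ** (1.0 - gamma)


# Compression ratios in the gasoline range quoted in the text.
for r in (8, 10, 12):
    print(f"r = {r:>2}: ideal efficiency = {otto_efficiency(r):.3f}")
```

The ideal efficiency rises steadily with compression ratio, which is the underlying reason why engines that can tolerate higher compression, such as diesel engines, achieve greater thermal efficiency.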




Figure 2.5 Exhaust Stroke

UKCar.com. (n.d.). Four-stroke engine features. Retrieved October 20, 2024, from http://www.ukcar.com/features/tech/Engine/fourstroke.htm 


2.1.2. Comparison Between 4-stroke Engines and 2-Stroke Engines

Two-stroke (Figure 2.7) and four-stroke (Figure 2.6) engines differ in construction and application. A two-stroke engine produces one power stroke for every revolution of the crankshaft, whereas a four-stroke engine produces one power stroke for every two revolutions. Because of this basic difference, four-stroke engines have more components, such as valves, camshafts, and tappets, than the simpler two-stroke design, which relies on ports rather than valves. A two-stroke engine completes its cycle in two strokes, giving greater power output but increased wear and emissions, whereas a four-stroke engine completes its cycle in four strokes, offering better fuel economy and lower emissions (Pulkrabek, 2004).

For the same size and speed, a four-stroke engine develops less power than a two-stroke engine. Because its power strokes are less frequent, its turning moment is less uniform, so it needs a heavier flywheel; a two-stroke engine, with more evenly spaced power strokes, can use a smaller one.

Because a four-stroke engine has more moving parts, friction is higher and mechanical efficiency is lower. Two-stroke engines, having fewer moving components, are more mechanically efficient. However, four-stroke engines achieve higher thermal efficiency because the burnt gases are almost completely expelled before the fresh charge enters. In two-stroke engines, some fresh charge mixes with the exhaust gases, which lowers their efficiency.

Cooling systems also differ: two-stroke engines are often air-cooled and are found in motorcycles, tricycles, and small boats, while four-stroke engines are usually water-cooled and are used in cars and commercial vehicles. Lubricating-oil consumption differs as well: four-stroke engines generally consume less lubricating oil than two-stroke engines, in which the oil is usually burned along with the fuel.


Figure 2.6 Four-Stroke Engine

 ScienceDirect. (n.d.). Four-stroke cycle - an overview. Retrieved from https://www.sciencedirect.com/topics/engineering/four-stroke-cycle 



Figure 2.7 Two-Stroke Engine

The Engineering Choice. (n.d.). What is a two-stroke engine? Retrieved from  https://www.theengineeringchoice.com/what-is-two-stroke-engine/ 


2.1.3. Classification of Engines

Automobile engines are broadly classified according to several characteristics, including the number of cylinders, cylinder arrangement, valve location, valve train type, combustion chamber design, type of fuel, ignition type, cooling system, lubrication system, strokes per cycle, engine design, air intake process, and method of fuel intake. Common cylinder arrangements include inline, V-type, and radial configurations, each suited to specific vehicle applications (Giakoumis, 2016). Figure 2.8 shows different engine arrangements.

By number of cylinders, engines may have 3, 4, 5, or up to 12 cylinders, depending on the type and purpose of the vehicle. By cylinder arrangement, an inline engine has its cylinders arranged in a single row, while a V-engine has two banks of cylinders inclined at an angle to one another. Opposed-cylinder, W, opposed-piston, and radial engines are further classes based on cylinder arrangement.


Figure 2.8 Different Engine Arrangements

Savree. (n.d.). Straight and V-type engines [Image]. Savree. https://www.savree.com/en/encyclopedia/straight-and-vtype-engines 

Engines may run on petrol, diesel, methanol, or natural gas, and some engines can operate on two or more fuels.

Engines use one of two ignition methods: spark ignition (SI) or compression ignition (CI). In a spark-ignition engine (Figure 2.9), the heat required to cause combustion is supplied by a spark plug, while in a compression-ignition engine (Figure 2.10), atomised diesel sprayed into highly compressed air ignites and brings about combustion.




Figure 2.9 Spark Ignition Engine

Mechanical Education. (n.d.). Spark ignition engine working. Retrieved from https://www.mechanicaleducation.com/spark-ignition-engine-working/ 

 




Figure 2.10 Compression Ignition Engine

ScienceDirect. (n.d.). Compression ignition - an overview. Retrieved from https://www.sciencedirect.com/topics/engineering/compression-ignition

According to the cooling system, engines may be air-cooled or liquid-cooled, while lubrication systems can be pressurized or splash-type (Heywood, 1998). Based on the number of strokes per cycle, engines are categorized as two-stroke or four-stroke. Depending on the motion of the piston or similar parts, engines are classified as reciprocating, where the piston moves back and forth, or rotary (e.g., Wankel engines), where a triangular rotor performs the four-stroke cycle (Mazda, 2021).

Valve arrangement forms another classification basis, including valve-in-head types, where the valves are in the cylinder head, and configurations with the valves in the engine block. Valve train types are either push rod (camshaft in the engine block) or overhead camshaft (camshaft on the cylinder head) (Setright, 2003). Engines can also be grouped by combustion chamber type, such as wedge-shaped or hemispherical, or by usage, including automobile, truck, motorcycle, airplane, and stationary engines for agricultural pumps or generators.

2.1.4 Components of Engine 

The engine comprises several critical components that work together to ensure efficient operation. These components include the cylinder block, cylinder head, crankshaft, camshaft, and valve train, each performing an essential role (Heywood, 1988).

2.1.4.1.  Cylinder Block

The cylinder block (Figure 2.11) serves as the foundation for the engine, housing several attached components, including the cylinder head, crankshaft, oil and water pumps, transmission, and manifolds for exhaust and intake. It is typically manufactured through sand casting, with machined surfaces and holes for smooth operation (Kalpakjian et al., 2016). Cylinder blocks are commonly made of cast iron or aluminium alloys. Aluminium blocks often include liners or sleeves to protect cylinders from wear caused by piston rings, which are made of cast iron (Crouse et al., 2013). Liners may be either wet or dry, and some aluminium blocks are alloyed with silicon to improve durability (Reif, 2014).




Figure 2.11 Cylinder Block

ML Vehicle. (n.d.). Cylinder block material. Retrieved from https://www.ml-vehicle.com/info/cylinder-block-material-83538008.html 


2.1.4.2.  Cylinder Head

Attached to the top of the cylinder block, the cylinder head (Figure 2.12) covers the upper cylinder portion and contains the combustion chambers, valves, rocker arms, and overhead camshaft(s) (Crouse et al., 2013). It is constructed from the same materials as the cylinder block and can feature wedge-shaped or hemispherical combustion chambers. Wedge heads position valves parallel to each other, while hemispherical heads place valves on opposite sides, enhancing combustion efficiency. A gasket is inserted between the cylinder block and head to prevent leakage under pressure and heat (Kalpakjian et al., 2016). Gaskets are made from materials like copper, asbestos, or steel.



               Figure 2.12 Cylinder Head

iStock. (n.d.). Cylinder head photos. Retrieved from https://www.istockphoto.com/photos/cylinder-head 






2.1.4.3. Crankshaft

The crankshaft (Figure 2.13) converts the reciprocating motion of the pistons into rotary motion, ensuring smooth engine operation (Heywood, 1988). It is supported by main and rod journals, with throws offset to accommodate connecting rods (Crouse et al., 2013). Counterweights balance the crankshaft, while drilled oil passages provide lubrication to bearings. Crankshafts are manufactured through casting, forging, or machining, with forged types used for high-performance applications (Kalpakjian et al., 2016). At the rear end, the crankshaft attaches to the flywheel, while the front end accommodates the timing chain, sprocket, or pulley.


                    Figure 2.13 Crankshaft

Internet Diesel. (n.d.). Detroit Diesel crankshaft. Retrieved from https://internetdiesel.com/products/detroit-diesel-crankshaft 

2.1.4.4. Camshaft

The camshaft (Figure 2.14) regulates the opening and closing of intake and exhaust valves, controlling valve overlap and ensuring synchronized operation with the crankshaft. It is driven by a timing chain, belt, or gears, with a gear ratio of 1:2 to maintain half the speed of the crankshaft (Crouse et al., 2013). 

Camshafts are made from hardenable iron alloys or steel, with lobes shaped to actuate the valve train (Reif, 2014). In some older engines, the camshaft also drives the distributor and oil pump. Newer engines feature position sensors on the camshaft for precise timing of fuel injection and ignition.


     Figure 2.14 Camshaft

Autoprotoway. (n.d.). What is a camshaft? [Image]. Autoprotoway. https://autoprotoway.com/what-is-a-camshaft/ 


2.1.4.5. Valve Train

The valve train as seen in Figure 2.15 consists of the camshaft, cam lobes, rocker arms, valves, valve springs, and cam followers (Heywood, 1988). Cam lobes on the camshaft lift the rocker arms, converting rotary motion into linear motion to open and close valves (Crouse et al., 2013). The timing of this movement is critical to synchronize with the pistons. Rocker arms, typically made of steel for strength and leverage, ensure proper valve operation. In desmodromic valvetrains, additional cams actively close the valves, eliminating the need for valve springs.

Each of these components works together to perform the four-stroke cycle and ensure the engine's functionality, reliability, and efficiency (Pulkrabek, 2014).


Figure 2.15 Valve Train

Westcan Auto. (n.d.). Valve train & cylinder head parts [Image]. Westcan Auto. https://westcanauto.com/product/valve-train-cylinder-head-parts/ 




2.1.4.6. Piston and Piston Rings

The piston (Figure 2.16), housed within the cylinder, moves up and down during engine operation (Heywood, 1988). It is designed to endure high heat and pressure while transferring force to the connecting rod (Pulkrabek, 2014). Pistons are made of high-strength aluminium alloy and are slightly smaller than the cylinder to allow smooth movement (Crouse et al., 2013). To prevent fuel leakage, piston rings are fitted around the piston (Reif, 2014). Compression rings, positioned at the upper part of the piston, seal the combustion chamber and are typically made of cast iron, sometimes plated with chromium or molybdenum for durability (Kalpakjian & Schmid, 2016). Oil control rings, located lower on the piston, scrape excess oil off the cylinder walls.



                             Figure 2.16 Piston Rings

ScienceDirect. (n.d.). Piston ring [Image]. ScienceDirect. https://www.sciencedirect.com/topics/materials-science/piston-ring


2.1.4.7 Connecting Rod

The connecting rod (Figure 2.17) transfers the motion of the piston to the crankshaft (Heywood, 1988). It connects to the piston through a pin and to the crankshaft at the big end with the help of a rod cap and bearing (Pulkrabek, 2014). Made from cast or forged steel or iron, the connecting rod must be strong and lightweight to handle engine impulses (Crouse et al., 2013).


Figure 2.17 Connecting Rods

Perkins. (n.d.). Con rods [Image]. Perkins. https://www.perkins.com/en_GB/aftermarket/overhaul/overhaul-components/major-components/con-rods.html


2.1.5 Automotive Electrical Systems

The automotive electrical system includes the battery, starter motor, alternator, and several Electronic Control Units (ECUs) that monitor engine performance (Bosch, 2007). It can be divided into the starting system, the charging system, and the lighting system. The starting system uses the battery, starter motor, solenoid, ignition switch, and neutral safety switch to convert electrical energy from the battery into mechanical energy to start the engine (Denton, 2017). The charging system uses an AC or DC generator, drive belt, voltage regulator, and wiring harness to convert mechanical energy into electrical energy, which recharges the battery and powers the electrical accessories (Bosch, 2007).

The lighting system comprises both interior and exterior lights, including headlights, turn signals, and instrument backlights (Duffy, 2009). Modern systems use computers and sensors for increased functionality. Advanced vehicle features that combine mechanical and electronic innovations, including cruise control, memory seats, electronic sunroofs, antitheft systems, automatic door locks, and keyless entry, improve safety, efficiency, and user convenience (Bosch, 2007).




2.1.6. Automotive Fuel System

The fuel system comprises the fuel tank, pump, filter, and injectors, which ensure proper delivery of fuel to the engine (Pulkrabek, 2004). Modern fuel systems use electronic fuel injection for precise control and efficiency (Giakoumis, 2016). The main components, each with a specific role in ensuring efficient operation, are as follows:

 Fuel Tank

The fuel tank serves as the initial storage for gasoline or diesel. Located at either the rear or front of the vehicle (depending on the engine placement), it is made of pressed metal sheets or reinforced plastic. The tank features a filler neck for refuelling and houses an electrical fuel gauge sender to monitor fuel levels.

 Connecting Lines

Fuel is transported between the tank, pump, and engine via small-diameter metal tubing and synthetic rubber hoses. Rubber hoses are used in areas prone to vibration to prevent cracking or bending of metal lines.

Fuel Pump

The fuel pump transports fuel from the tank to the carburettor or fuel injection system, which is typically positioned above the tank. There are two types of fuel pumps: mechanical fuel pumps, which are driven by engine power, often via the camshaft, and electrical fuel pumps, which are located near or in the tank for heat protection. Electrical pumps can be either impeller-type or diaphragm-type.

Fuel Filter

The fuel filter, located between the tank and carburettor or fuel injection system, removes sediment, rust, and contaminants to ensure clean fuel delivery.

Air Filter

Mounted in the air intake system, the air filter removes dust, dirt, and moisture from incoming air before it mixes with fuel. This ensures clean air is delivered to the engine, improving combustion and engine longevity.


2.1.6.1 Fuel Metering and Atomization System

The fuel system mixes air and fuel to form an air-fuel mixture tailored to engine demands. A rich mixture (e.g., 11.5:1) is necessary during heavy acceleration or engine warm-up, while a lean mixture (e.g., 18:1) is used for idling or low-load cruising. The ideal air-fuel ratio for efficient combustion is 14.7:1 at sea level, but higher altitudes require leaner mixtures due to lower oxygen levels in the air.
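The rich, lean, and ideal mixtures described above can be expressed through the excess-air ratio (lambda), the actual air-fuel ratio divided by the stoichiometric one. The following is a minimal sketch using the 14.7:1 figure quoted in the text; the function names and the 2% tolerance band are illustrative assumptions, not values from any engine control standard.

```python
STOICH_AFR = 14.7  # stoichiometric air-fuel ratio for petrol, as quoted in the text


def lambda_value(afr: float, stoich: float = STOICH_AFR) -> float:
    """Excess-air ratio: below 1 is rich, above 1 is lean."""
    return afr / stoich


def classify_mixture(afr: float, tolerance: float = 0.02) -> str:
    """Label a mixture; the tolerance band around lambda = 1 is an assumed value."""
    lam = lambda_value(afr)
    if lam < 1.0 - tolerance:
        return "rich"
    if lam > 1.0 + tolerance:
        return "lean"
    return "stoichiometric"


if __name__ == "__main__":
    # Ratios taken from the examples in the text.
    for afr in (11.5, 14.7, 18.0):
        print(f"AFR {afr}:1 -> lambda = {lambda_value(afr):.2f} ({classify_mixture(afr)})")
```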

2.1.6.2. Fuel Atomization and Vaporization

Atomization and vaporization are essential processes in fuel systems for efficient combustion (Heywood, 1988). Atomization reduces fuel into fine droplets, improving its mixing with air, while vaporization occurs in the intake manifold and cylinder, supported by airflow, low pressure, and heat (Pulkrabek, 2014).

Fuel delivery can be achieved through two main systems: carburettors and Electronic Fuel Injection (EFI). A carburettor as seen in Figure 2.18 is a vacuum-operated device that measures and atomizes fuel based on engine load, speed, and temperature. It operates on mechanical and hydraulic principles and may include modern components like temperature-control devices and throttle positioners to enhance performance and emission control (Crouse et al., 2013).


Figure 2.18 Carburettor

ATM Innovation. (n.d.). XRB race series E85 carburetor [Image]. ATM Innovation. https://www.atminnovation.com/products/xrb-race-series-e85-carburetor.html 



In contrast, EFI sprays fuel under pressure through injectors, with delivery managed by an onboard computer that processes data from various engine sensors. EFI systems provide greater efficiency, improved fuel economy, and reduced emissions compared to traditional carburettors, making them the preferred technology in modern vehicles.

2.1.6.3. Diesel Fuel Systems

The diesel fuel system is designed to handle the unique properties of diesel fuel, which contains heavier hydrocarbons and more heat energy than gasoline. Diesel fuel is classified into two grades based on volatility: Number 1 (more volatile, used in colder regions) and Number 2 (less volatile, used in moderate climates) (Pulkrabek, 2014).

Cetane rating measures the fuel’s ignition properties under compression, with higher ratings ensuring easier ignition (Denton, 2017). In operation, only air is compressed in the cylinder, and diesel fuel is injected at high pressure (5,791–8,844 kPa) through nozzles by an injection pump (Crouse et al., 2013). Engine speed is controlled by regulating the amount and timing of fuel injection, either mechanically or electronically (Stone, 2012). This system ensures efficient combustion, optimal performance, fuel economy, and reduced emissions.


2.1.7 Engine Performance

The performance of an automobile engine is determined by how efficiently it performs its assigned work. It is characterised by quantities such as indicated horsepower (i.h.p.), brake horsepower (b.h.p.), brake mean effective pressure (b.m.e.p.), mechanical efficiency, volumetric efficiency, thermal efficiency, and specific fuel consumption.

Indicated Horsepower (i.h.p.) refers to the power developed at the piston face during the mechanical cycle, calculated using the formula:

\[ \text{i.h.p.} = \frac{P_{im}\, L\, A\, n}{K} \]

where \(P_{im}\) is the indicated mean effective pressure, \(L\) is the stroke length, \(A\) is the cylinder cross-sectional area, \(n\) is the number of working strokes per minute, and \(K\) is a unit-conversion constant (4500 when \(P_{im}\) is in kgf/cm², \(L\) in m, and \(A\) in cm², which gives metric horsepower).

Brake Horsepower (b.h.p.) represents the actual power output at the crankshaft and can be determined using:

\[ \text{b.h.p.} = \frac{2\pi N T}{K} \]

where \(N\) is the crankshaft speed in revolutions per minute and \(T\) is the engine torque (in kgf·m when \(K = 4500\)).

The difference between i.h.p. and b.h.p. is due to frictional losses and is known as friction horsepower (f.h.p.):

\[ \text{f.h.p.} = \text{i.h.p.} - \text{b.h.p.} \]

Engine Torque is related to the brake mean effective pressure (b.m.e.p.) by the equation:

\[ T = C \times \text{b.m.e.p.} \]

where \(C\) is a constant specific to each engine.

Thermal Efficiency measures the ratio of useful work to the heat supplied to the engine, either as brake thermal efficiency \(\eta_{bt}\) or indicated thermal efficiency \(\eta_{it}\). With the power and the heat input expressed in the same units, the formulas are:

\[ \eta_{bt} = \frac{\text{b.h.p.}}{m_f\, C_v} \quad \text{and} \quad \eta_{it} = \frac{\text{i.h.p.}}{m_f\, C_v} \]

where \(m_f\) is the fuel supplied (kg/min) and \(C_v\) is the calorific value of the fuel.

Specific Fuel Consumption (S.F.C.) refers to the total fuel consumed per hour per horsepower developed. It can be indicated (i.s.f.c.) or brake (b.s.f.c.), with formulas:

\[ \text{i.s.f.c.} = \frac{60\, m_f}{\text{i.h.p.}} \quad \text{and} \quad \text{b.s.f.c.} = \frac{60\, m_f}{\text{b.h.p.}} \]

Mechanical Efficiency (\(\eta_m\)) is the ratio of power delivered (b.h.p.) to the total power developed within the engine (i.h.p.):

\[ \eta_m = \frac{\text{b.h.p.}}{\text{i.h.p.}} \]
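As a worked illustration of the performance quantities in this section, the sketch below computes indicated, brake, and friction horsepower and mechanical efficiency for a hypothetical single-cylinder four-stroke engine. It assumes the common metric forms i.h.p. = P·L·A·n/4500 and b.h.p. = 2πNT/4500 (pressure in kgf/cm², stroke in m, area in cm², torque in kgf·m). All input values are invented for illustration and do not come from any measured engine.

```python
import math

HP_METRIC = 4500.0  # kgf·m per minute in one metric horsepower (assumed unit system)


def indicated_hp(p_im: float, stroke_m: float, area_cm2: float, strokes_per_min: float) -> float:
    """i.h.p. = P_im * L * A * n / 4500."""
    return p_im * stroke_m * area_cm2 * strokes_per_min / HP_METRIC


def brake_hp(torque_kgf_m: float, rpm: float) -> float:
    """b.h.p. = 2 * pi * N * T / 4500, with N in rev/min."""
    return 2.0 * math.pi * rpm * torque_kgf_m / HP_METRIC


def friction_hp(ihp: float, bhp: float) -> float:
    """f.h.p. = i.h.p. - b.h.p."""
    return ihp - bhp


def mechanical_efficiency(ihp: float, bhp: float) -> float:
    """Ratio of power delivered at the crankshaft to power developed in the cylinder."""
    return bhp / ihp


if __name__ == "__main__":
    rpm = 3000.0
    working_strokes = rpm / 2.0  # four-stroke: one power stroke every two revolutions
    ihp = indicated_hp(p_im=7.0, stroke_m=0.09, area_cm2=50.0, strokes_per_min=working_strokes)
    bhp = brake_hp(torque_kgf_m=2.0, rpm=rpm)
    print(f"i.h.p. = {ihp:.2f}, b.h.p. = {bhp:.2f}, f.h.p. = {friction_hp(ihp, bhp):.2f}")
    print(f"mechanical efficiency = {mechanical_efficiency(ihp, bhp):.1%}")
```

For a two-stroke engine the number of working strokes per minute would equal the crankshaft speed rather than half of it.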

2.2 Historical Context and Evolution of Condition Based Maintenance

With the growing use of sensors in vehicles, understanding vehicle conditions and operating parameters has become increasingly important, since Condition Based Maintenance (CBM) is a predictive maintenance approach that depends on them. The importance of predictive maintenance as a general concept cannot be overstated. Predictive maintenance aims to eliminate manual diagnostics, improve performance visibility, increase accountability, and compensate for manpower shortages (Mobley, 2002). CBM emerged as a solution to the inefficiencies of conventional maintenance methods.

Predictive maintenance (PdM) has developed over many years. From the Industrial Age, beginning in the mid-18th century, through the mid-20th century and into the early Information Age, maintenance was largely reactive: repairs were undertaken only after a machine or tool had broken down (Tsang, 1995). As machines grew more sophisticated, preventive maintenance evolved, in which repairs were planned ahead of time to avoid sudden breakdowns (Mobley, 2002). The British Royal Air Force first instituted predictive maintenance in the 1940s, using post-flight inspections to find and correct problems quickly; Williams (2006) reports a 60% improvement in flight time. Despite its limitations, PdM gained popularity in the 1990s and is now used in many disciplines, including medicine, engineering, the military, and computing (Jardine et al., 2006). The predictive maintenance flow is shown in Figure 2.19.


Figure 2.19 The Predictive Maintenance Flow

AutoPi. (n.d.). What is predictive maintenance? [Image]. AutoPi. https://www.autopi.io/blog/what-is-predictive-maintenance/


Driven by information technology and automation, the Third Industrial Revolution enabled the development of Condition-Based Maintenance (CBM) and, eventually, Predictive Maintenance 4.0 (PdM 4.0). PdM 4.0 combines data analytics, artificial intelligence (AI), and Internet of Things (IoT) technologies to predict failures and detect faults with high precision (Lee et al., 2014). It uses real-time sensor data collection to support proactive maintenance decisions. By using big data, cloud computing, and sophisticated machine learning methods, PdM offers precise, data-driven forecasts that lower maintenance costs and boost efficiency (Shao, 2017).

Predictive maintenance employs several approaches and techniques, which continue to improve alongside advances in machine learning, data, and sensor technologies. These include Model-Based Diagnostics (MBD), Condition-Based Maintenance (CBM), Remaining Useful Life (RUL) prediction, hybrid approaches, and the synthetic minority oversampling technique (SMOTE) for imbalanced data.

Compared with the other methods, CBM lowers vehicle failure rates by 55% to 70%, and its parameter analysis gives a fault detection rate of more than 90% (Jardine et al., 2006).

Although PdM originated in the 1940s, CBM became well known only in the 1960s and 1970s. The shortcomings of preventive maintenance (PM) drove the development of PdM. Among the methods applied in CBM are vibration analysis, which detects mechanical problems; oil analysis, which examines lubricant quality; and thermography, which uses infrared imaging to find temperature anomalies. These methods have changed over time, mostly in response to technological developments. Thanks to much improved sensor technology in cars, thorough testing and analysis of many vehicle components is now possible (Lee et al., 2014).

 Published in 2001, ISO 13306 defines Condition-Based Maintenance as a type of preventive maintenance based on performance and/or parameter monitoring, so including predictive maintenance as a subset of CBM.

 For many companies, maintenance expenses make up about 15% to 40% of their overall production expenses (Mobley, 2002).  "Predictive maintenance is highly cost-effective, saving roughly 8% to 12% over preventative maintenance, and up to 40% over reactive maintenance," the U.S. Department of Energy notes in its Operations & Maintenance Best Practices Guide: Release 3.0 (2010).

In vehicles, CBM uses sensor data, collected through real-time or periodic monitoring, to evaluate the state of the vehicle and identify trends or faults. Because problems are addressed well before they become severe, it has proved over the years to be a reasonably affordable solution. CBM detects emerging faults by means of statistical, machine learning, and deep learning algorithms. Predictive maintenance in general can serve several purposes; CBM, however, is the approach most directly suited to fault detection.
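The simplest statistical form of the condition monitoring described above is to compare each new sensor reading with a baseline recorded while the engine was known to be healthy. The sketch below flags readings that lie more than a chosen number of standard deviations from that baseline. The coolant temperature values, the baseline length, and the threshold of three standard deviations are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(readings, baseline_size=5, z_threshold=3.0):
    """Return the indices of readings that deviate strongly from a healthy baseline.

    The first `baseline_size` readings are assumed to represent normal operation.
    """
    baseline = readings[:baseline_size]
    mu, sigma = mean(baseline), stdev(baseline)
    flagged = []
    for i in range(baseline_size, len(readings)):
        # z-score of the reading relative to the baseline distribution
        z = (readings[i] - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > z_threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical coolant temperatures in degrees Celsius; the last two drift upward.
    coolant_temp_c = [89.5, 90.1, 90.4, 89.8, 90.2, 90.0, 90.6, 97.5, 104.2]
    print("Anomalous sample indices:", flag_anomalies(coolant_temp_c))
```

A fixed baseline keeps the example short; a practical system would more likely use a rolling window or a learned model so that normal changes in operating condition are not flagged.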

CBM can be weighed against other common forms of maintenance, namely preventive and reactive maintenance. Reactive maintenance, often referred to as corrective or break-fix maintenance, is far more expensive. Although it is among the simplest maintenance techniques, nothing is done until a failure has occurred. Preventive maintenance, by contrast, is time-based, since it is carried out on a schedule. It can also be usage-based; for instance, changing the oil of an automobile every 600,000 km constitutes usage-based preventive maintenance. Improvements in sensor technology and data analytics propelled the development of CBM (Mobley, 2002). The real-time monitoring and predictive capabilities of CBM help to identify possible failures before they occur (Lee et al., 2014).

Table 2.1 compares CBM with other predictive maintenance approaches in terms of their technological aspects in vehicles:

Table 2.1 Comparison between Predictive Models

| Technique | Strengths | Weaknesses |
| --- | --- | --- |
| Model-Based Diagnostics (MBD) | Accurate fault prediction. | High complexity in model development. |
| | Real-time monitoring. | Dependence on human expertise. |
| | Early detection of potential failures. | High sensor/ECU costs. |
| Condition-Based Maintenance (CBM) | Efficient, real-time fault detection. | High initial setup cost. |
| | Reduces unnecessary part replacements. | On-board data processing limitations. |
| | Cost-effective long-term. | Hard to integrate into older vehicles. |
| Remaining Useful Life (RUL) Prediction | Improves maintenance scheduling. | Limited component scope. |
| | Tailored to specific components. | High data requirements. |
| | Prevents unexpected failures. | Dependence on model accuracy. |
| Hybrid Approaches | Combines physical and ML models for improved accuracy. | Complex integration and design. |
| | Versatile across components. | High cost of development and deployment. |
| SMOTE for Imbalanced Data | Addresses data imbalance. | Synthetic data risks introducing noise. |
| | Improves model performance for rare faults. | May not always suit complex systems. |


2.3. Modelling and Simulation in Automobile Maintenance

Models are representations of elements or events from the real world. They try to convey a possible framework of physical causality and are developed to support research within a logical and verifiable framework (Banks et al., 2010). Simulation, by contrast, is the operation of a model in order to analyse the performance of a system or phenomenon; it helps to study or forecast system behaviour. There are several kinds of models: physical, predictive, analogue, simulation, heuristic, stochastic, and deterministic. By allowing system behaviour to be predicted under different conditions, modelling and simulation play an important role in car maintenance (Banks et al., 2010).

These techniques are applied extensively in automobile development and maintenance. Physical models are scaled replicas; buildings, for example, are often modelled to scale to show how the actual structure would look.

Predictive models are widely used (Zhang et al., 2020). They rely on independent variables to produce the value of a dependent variable.

Analogue models depict real-world systems directly but use different entities to do so. In an analogue computer, for instance, electrical currents stand in for quantities such as materials or people flowing within a system. Other examples include electrical signals in medical tests, which reflect the operation of muscles or organs, and tyre pressure gauges, in which the dial movement corresponds to the air pressure.

Simulation models are used when, because of complexity or impracticality, predictive models or real-world experiments cannot adequately depict a real-world system; such models use sequences of random numbers. Heuristic models replicate system behaviours and apply futuristic or intuitive guidelines to produce refined answers. For instance, satellite communication systems originated in the heuristic model developed by Arthur C. Clarke. Such models rely heavily on heuristics, the mental shortcuts taken to solve problems rapidly. Stochastic models, including Monte Carlo simulations, regression models, and Markov chains, use random variables to project results under uncertainty. They find extensive application in sectors including manufacturing, finance, agriculture, and meteorology.
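To make the idea of a Monte Carlo simulation concrete in a maintenance setting, the sketch below estimates the probability that a component fails before its next scheduled service by drawing many random lifetimes. It assumes, purely for illustration, that component life follows an exponential distribution; the mean life of 50,000 km and the 10,000 km service interval are hypothetical values.

```python
import math
import random


def failure_probability(mean_life_km, interval_km, trials=100_000, seed=42):
    """Monte Carlo estimate of P(component life < service interval).

    Lifetimes are drawn from an exponential distribution (an assumed model).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    failures = sum(
        1 for _ in range(trials)
        if rng.expovariate(1.0 / mean_life_km) < interval_km
    )
    return failures / trials


if __name__ == "__main__":
    estimate = failure_probability(mean_life_km=50_000, interval_km=10_000)
    exact = 1.0 - math.exp(-10_000 / 50_000)  # closed form for the exponential case
    print(f"simulated: {estimate:.4f}, analytical: {exact:.4f}")
```

The exponential case has a closed-form answer, which makes it a convenient check on the simulation; the value of the Monte Carlo approach lies in cases where no such formula is available.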

Deterministic models, by contrast, involve no randomness. Assuming certainty in all respects, the parameters of the model and its past states determine every outcome. Examples include schedules, pricing policies, linear programming, and the Economic Order Quantity (EOQ) model used in inventory control.

These approaches are used in industrial engineering, automotive engineering and maintenance, medicine, and artificial intelligence and machine learning, as technological developments have kept pace with these methods.

2.3.1. Predictive Models

Predictive models are abstract representations of real-world problems guided by logical implications (Kaplan, 2018). Many depend on relationships between variables to determine outcomes (Giordano et al., 2013). They can be descriptive, prescriptive, deterministic, stochastic, dynamic, or static (Banks, 2001). They are used to determine the outcomes of systems and processes based on data and input variables (Montgomery et al., 2021). The main categories are regression models, which predict continuous outcomes, and classification models, which predict categorical outcomes (Hastie et al., 2009). Other categories include time series models, stochastic models, dynamic models, and machine learning predictive models (Box et al., 201

2.3.2 Machine Learning Models

Predictive maintenance (PdM) in vehicles now relies heavily on machine learning (ML) (Zhang et al., 2020). Automobile fault detection makes use of supervised, semi-supervised, and unsupervised learning approaches (Bose et al., 2021). Unsupervised learning systems group data points based on deviations in real-time data (Hastie et al., 2009), which is especially helpful for finding unknown faults (Zhang et al., 2020). Semi-supervised learning combines labelled and unlabelled data, letting models learn from a small amount of labelled data (Chapelle et al., 2006); it is especially useful when precise labels for large datasets are lacking (Van et al., 2020). Decision trees are among the supervised learning methods applied to classification and regression problems (Quinlan, 1996). A decision tree is made up of a root node, branches, internal nodes, and leaf nodes. It is quite effective for fault detection because abnormalities are rapidly classified for study (Breiman, 2017). Other commonly used techniques are logistic regression (Hastie et al., 2009), random forest, and K-Means.


2.3.2.1.  Decision Trees

Decision trees (Figure 2.20) are structured models for classification and regression that split a dataset step by step according to feature values. By organising data into nodes and branches, they simplify difficult decisions (Breiman, 2017). Fundamental to their operation are entropy, which measures the randomness of the data, and information gain, which is used to choose the best feature on which to split (Mitchell, 1997). These features make decision trees well suited to fault detection, where they logically sequence system parameters to find problems (Hastie et al., 2009). Their capacity to handle noisy or incomplete data helps them operate in real time in applications such as risk assessment, equipment monitoring, and medical diagnosis (Loh, 2011). Techniques such as pruning and cross-validation improve their performance by reducing overfitting and enhancing generalisation to new data (Breiman et al., 1984). Despite difficulties such as handling large datasets and minimising selection bias, decision trees are consistent, interpretable, and widely used for fault detection and diagnosis across many disciplines (Quinlan, 1996; Hastie et al., 2009).
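Entropy and information gain can be illustrated with a short calculation. The sketch below computes the entropy of a set of class labels and the information gain obtained by splitting on each of two categorical features. The six engine records and the feature names oil_pressure and coolant_temp are hypothetical and serve only to show how a tree would choose its first split.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Reduction in entropy obtained by splitting the rows on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical engine records, invented for illustration only.
rows = [
    {"oil_pressure": "low", "coolant_temp": "high"},
    {"oil_pressure": "low", "coolant_temp": "normal"},
    {"oil_pressure": "normal", "coolant_temp": "high"},
    {"oil_pressure": "normal", "coolant_temp": "normal"},
    {"oil_pressure": "low", "coolant_temp": "high"},
    {"oil_pressure": "normal", "coolant_temp": "normal"},
]
labels = ["fault", "fault", "ok", "ok", "fault", "ok"]

if __name__ == "__main__":
    for feature in ("oil_pressure", "coolant_temp"):
        print(f"{feature}: gain = {information_gain(rows, labels, feature):.3f} bits")
```

In this invented data set, oil pressure separates the two classes completely, so it yields the larger gain and would be chosen as the root split.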
Decision trees vary in kind depending on its intended use and framework. Whereas regression trees manage continuous numerical predictions, classification trees forecast categorical results (Loh, 2011). Whereas multi-way trees offer many branches depending on different values, binary decision trees partition data into two branches each node (Breiman et al., 1984).
Using measures like the Gini index or mean squared error (Breiman, 2017), CART—Classification and Regression Trees—helps with both goals. Using information gain and gain ratio, ID3 and C4.5 trees rank attributes; C4.5 allows continuous data (Quinlan, 1996). Advanced variants include boosted trees—which repeatedly fix errors—and random forests, which mix several trees for resilience (Hastie et al., 2009). Other variants, including oblique decision trees and fuzzy decision trees, respectively, solve feature interactions and uncertainty, respectively, so making decision trees flexible for many uses (Loh, 2014).
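
To make the roles of entropy and information gain concrete, the short Python sketch below computes both for a candidate split. It is an illustration only: the function names and the toy labels (1 for a good engine, 0 for a bad engine) are assumptions made for this example and are not taken from the project dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, left, right):
    """Reduction in entropy obtained by splitting `parent` into `left` and `right`."""
    total = len(parent)
    weighted_child_entropy = (
        (len(left) / total) * entropy(left)
        + (len(right) / total) * entropy(right)
    )
    return entropy(parent) - weighted_child_entropy


# Hypothetical toy labels: 1 = good engine, 0 = bad engine.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]   # samples falling on one side of a candidate threshold
right = [1, 0, 0, 0]  # samples falling on the other side

gain = information_gain(parent, left, right)
print(f"Parent entropy: {entropy(parent):.3f} bits, information gain: {gain:.3f} bits")
```

A tree-building algorithm of the ID3 family would evaluate this gain for every candidate feature and keep the split with the largest value.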



Figure 2.20 A Typical Decision Tree

Jain, V. (n.d.). Understanding decision trees [Image]. Medium. https://medium.com/@jainvidip/understanding-decision-trees-1ba0ef5f6bb4 

2.4. Studies and Development

Several studies have investigated predictive maintenance methods in the automotive sector and have underlined the success of decision trees and machine learning models (Kumar et al., 2018). With the growth of automotive engineering, physical, predictive, or digital models are often insufficient to reflect the nature of automobile behaviour (Annamalai, 2020). As machines become more connected through cyber-physical systems, predictive models must be able to process enormous amounts of data in real time (Jain, 2019). Existing models often find it difficult to combine data across several domains (mechanical, electrical, thermodynamic), which results in erroneous predictions. Models that can do so usually achieve better accuracy in improving maintenance practices, since they consider the unique characteristics of an automobile system (Mykich et al., 2024).

The evolution of maintenance from reactive and corrective approaches to proactive and predictive techniques reveals the increasing demand for more sophisticated solutions. Although maintenance techniques such as Total Productive Maintenance (TPM) and Reliability-Centred Maintenance (RCM) have improved equipment effectiveness and safety, preventive maintenance cycles remain ineffective for many machine components. About 92% of machine components do not benefit from cyclical maintenance, which results in unnecessary expenses (Kurkin et al., 2011). Predictive maintenance methods such as CBM, which depend on vehicle parameters or conditions, have been shown to optimise maintenance operations and lower uncertainty (Bunkar et al., 2021).

Work on predictive maintenance has been carried out regularly over the years. Khrystyna Mykich, Iryna Zavushchak and Andrii Savka (2024) conducted thorough research on predictive maintenance for automotive vehicle engines in military logistics, testing several models with different hyperparameters and documenting which one gave the most accurate fault detection results. Jain (2019) carried out similar work on diagnosing vehicle health using machine learning approaches. Annamalai (2020) extended this line of research by investigating cloud-based predictive maintenance and machine monitoring for intelligent manufacturing in the automotive industry and analysing its effects. Data-driven methods have shown promise for improving fault detection accuracy (Bunkar et al., 2021). Several other case studies demonstrate the application of predictive maintenance in the automotive industry:

1. Theissler (2021) proposed a semi-supervised method using a mix of one-class and two-class classifiers. One-class classifiers find unknown faults from normal operation data, whereas two-class classifiers identify known faults using both normal and faulty data. The method, enhanced by Support Vector Data Description (SVDD) and hyperparameter tuning, was tested on several cars.

2. Using OBD interface data, Shafi et al. (2021) created a vehicle fleet fault detection system. With data gathered from 70 automobiles, they applied decision trees, SVM, k-NN, and random forests to find faults across subsystems. The approach made it possible to find faults in one car and use that knowledge to notify others in similar circumstances.

3. Tagawa et al. (2021) proposed structured denoising autoencoders, which can work with partial knowledge about variables and errors. Although genuine faults were not used in their evaluation, their approach outperformed one-class SVMs in simulated driving conditions.

4. Routray et al. (2021) implemented a data-driven fault detection framework using unsupervised methods such as independent component analysis (ICA) and principal component analysis (PCA) for anomaly detection in automotive sensors.

2.5. Condition Based Maintenance Techniques

Hundreds of sensors make up the contemporary car.  These sensors track vehicle parameters, so enabling timely detection of faults (Ribbens, 2017).  Along with pressure sensors, speed sensors, position sensors, oxygen sensors, and proximity sensors, the car features temperature sensors to track the temperature of several components (Singh et al., 2019).  Obtaining parameters used in condition-based maintenance (CBM) systems falls to these sensors.

Among the several CBM methods are vibration analysis, which tracks automotive vibration patterns (Jardine et al., 2006).  Thermography uses infrared thermal imaging to find aberrant heat patterns generally resulting from electric faults (Mobley, 2002).  Another approach, oil analysis looks at lubricants to evaluate engine condition (Heng et al., 2009).  Engine temperature and lubricant temperature are both tracked in temperature monitoring analyses (Albarbar et al., 2010).  Other techniques consist in pressure and flow monitoring, corrosion monitoring, Motor Current Signature Analysis (MCSA), Electrical Signature Analysis (ESA), and acoustic emissions (Randall, 2011).  One can examine vehicle conditions using these several methods.  The parameters are evaluated depending on the data gathered by the sensors and the dependencies inside vehicle systems (Lee et al., 2014).

2.6. Overview of Diagnostic Systems in Modern Vehicles and Predictive Maintenance

Diagnostic systems are applied for defect detection in vehicle systems.  From simple manual methods to sophisticated electronic systems monitoring and reporting on vehicle performance, diagnostic systems in contemporary automobiles have developed dramatically (Bosch, 2020).

Introduced in the early 1980s, On-Board Diagnostics I (OBD I) was the first generation of electronic monitoring in automobiles (Zhang et al., 2017). However, OBD I systems ran into difficulties with interface and communication complexity. Different car manufacturers used proprietary diagnostic systems, which resulted in inconsistent fault detection across models and makes (Taylor, 2019). OBD II, introduced in 1996, addressed these problems by providing a consistent diagnostic system for all automobiles (Singh et al., 2019). This system comprised standard Diagnostic Trouble Codes (DTCs), a universal Data Link Connector (DLC), and improved vehicle emissions monitoring (Walker, 2016).

 Predictive maintenance (PdM) has developed alongside these diagnostic tools.  Vehicle maintenance has been much improved by the combination of OBD systems with predictive maintenance techniques.  Real-time data collecting from sensors included in the vehicle drives both systems (Bishop, 2021).  Analysing this information helps one to see trends and project possible failures, hence lowering breakdown risk and maximizing repair plans (Garg et al., 2020).

 Notwithstanding their benefits, these sophisticated diagnostic systems provide difficulties especially in terms of the technological knowledge needed for their operation (Kumar et al., 2018).

Usually located under the steering wheel, the OBD-II connector allows the Driver Information System (DIS) to link to a vehicle. This system gathers, analyses, and shows OBD-II data on an LCD screen, and its data logging capability stores the data on a multimedia card (MMC) in .csv format (Walker, 2016). The logged data can then be examined with software such as Microsoft Excel or Notepad. Although its displays vary with the vehicle and device, the DIS usually shows parameters including engine speed (RPM), vehicle speed (km/h), coolant temperature, and battery voltage (Bosch, 2020).

The DTC system records car diagnostics in real time. A scanning system monitors the Indicator Lamp, records confirmed and pending fault codes, and features a DTC reset capability (Garg & Sharma, 2020). Two basic parts make up the DTC hardware:

an OBD-II interface unit for vehicle data retrieval and transmission

a Real-Time Clock (RTC) unit, which timestamps diagnostic logs for later analysis (Taylor, 2019)

Diagnostic Trouble Codes (DTCs) are alphanumeric codes stored in the Electronic Control Unit (ECU) whenever a fault is detected (Singh et al., 2019). Each DTC corresponds to a specific system issue and follows a structured format, beginning with a letter that identifies the affected system:


P (Powertrain)

B (Body)

C (Chassis)

U (Network Communications)

The first digit of the code identifies whether the issue is generic (SAE-standardized, code "0") or manufacturer-specific (code "1"). The second digit identifies the affected subsystem, as listed below, and the last two digits identify the specific fault (Zhang et al., 2017):


1 – Fuel and air metering

2 – Fuel and air metering (injection circuit malfunctions only)

3 – Ignition system

4 – Auxiliary emission controls

5 – Vehicle speed control and idle control system

6 – Computer and auxiliary outputs

7 & 8 – Transmission faults
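
The code structure described above can be illustrated with a short Python sketch that splits a five-character DTC into its parts. The function name `decode_dtc` and the returned field names are assumptions made for this example. The lookup tables simply restate the lists given in this section, and the sketch assumes that the last four characters are decimal digits.

```python
# Lookup tables restating the lists given in this section.
SYSTEMS = {
    "P": "Powertrain",
    "B": "Body",
    "C": "Chassis",
    "U": "Network Communications",
}
SUBSYSTEMS = {
    "1": "Fuel and air metering",
    "2": "Fuel and air metering (injection circuit malfunctions only)",
    "3": "Ignition system",
    "4": "Auxiliary emission controls",
    "5": "Vehicle speed control and idle control system",
    "6": "Computer and auxiliary outputs",
    "7": "Transmission faults",
    "8": "Transmission faults",
}
ORIGINS = {"0": "generic (SAE-standardized)", "1": "manufacturer-specific"}


def decode_dtc(code):
    """Split a DTC such as 'P0301' into system, origin, subsystem and fault number."""
    code = code.strip().upper()
    if len(code) != 5 or code[0] not in SYSTEMS or not code[1:].isdigit():
        raise ValueError(f"Not a recognised DTC format: {code!r}")
    return {
        "system": SYSTEMS[code[0]],
        "origin": ORIGINS.get(code[1], "not listed"),
        "subsystem": SUBSYSTEMS.get(code[2], "not listed"),
        "fault_number": code[3:],
    }


decoded = decode_dtc("P0301")
for field, value in decoded.items():
    print(f"{field}: {value}")
```

For the code P0301, the sketch reports a generic powertrain code in the ignition system group with fault number 01.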

Figure 2.21 shows the Onboard Diagnostic Code Description.

Figure 2.21 OBD Codes Interpretation

CalAmp. (n.d.). DTC codes [Image]. CalAmp. https://www.calamp.com/blog/dtc-codes/ 


2.7 Sensor Technology

Almost every mechanical and electrical device in use has sensors (Patel & Singh, 2020). They are essential in cars because they gather the information that lets engineers monitor vehicle conditions (Johnson, 2019). Many sensors are used in vehicles, including mass flow sensors, tire-pressure sensors, Manifold Absolute Pressure (MAP) sensors, engine oil pressure sensors, and coolant temperature sensors (Kumar et al., 2021). They are usually found near the air intake manifold (between the air filter and the throttle body), along the exhaust pipe, close to the crankshaft or flywheel, attached to the cylinder block, near the brake system, in the dashboard, in the steering column, or in the vehicle bumpers. Since sensor types and locations vary between car models, this is not a complete list (Lee et al., 2022).

These sensors run automatically under a telematics process.  In sensors, telemetry is the process by which sensor data is gathered and stored for analysis (Anderson et al., 2018).  Sensor devices must gather the data, and centralised systems must exist to save it if this is to happen (Gonzalez et al., 2020).  The On-Board Diagnostics (OBD) tool, which gathers data on a vehicle's health state, is one prominent instance of automotive telematics (Martinez & Jones, 2017).  Predictive maintenance programs and vehicle health monitoring are built upon telematics.  The information gathered by sensory devices lets engineers examine performance criteria and draw conclusions to enhance vehicle systems (Chen et al., 2021). Figure 2.22 shows several sensors.


 

Figure 2.22 Automobile Sensors

Monolithic Power Systems. (n.d.). Automotive applications [Image]. Monolithic Power Systems. https://www.monolithicpower.com/en/learning/mpscholar/sensors/real-world-applications/automotive-applications 









CHAPTER THREE

   MATERIALS AND METHOD

3.1. Preamble

This chapter describes the techniques applied to examine sensor data in order to predict the condition of 19,534 motor vehicles. The study is based on supervised classification techniques, specifically Decision Trees, since they can interpret and predict engine conditions reasonably well. MATLAB was chosen as the analytical tool because of its advanced data engineering features, which support strong data preprocessing, modelling, and simulation. The workflow in this research consists of data sourcing, cleaning, modelling, simulation, and testing. Any method, from regression to decision trees or neural networks, can be applied when creating a predictive model.

3.2.  Analysis

The analysis comprised three main stages. In the first stage, data was gathered from a previously examined dataset of sensor readings related to vehicle engine health, in which the engines are divided into two groups: good and bad engine health. The data source was readily available and required no difficult methods to obtain. The second stage preprocessed the data, addressing problems such as outliers and inconsistencies to guarantee its quality. The last stage was training the model to find the most efficient configuration for accurate analysis and dependable results. The dataset was split into training and test data: the classification decision tree was trained on the training data, comprising 70% of the original dataset, while the accuracy of the algorithm was validated using the remaining 30% set aside for testing.

3.3. Data Sourcing

This study used a secondary dataset from Kaggle, an automotive engine health dataset with sensor readings from 19,534 vehicle engines. Two main forms of data extensively applied in PdM are historical maintenance data and sensor data (Cheng, 2020). Sensing and monitoring tools were used to compile these data (Mykich et al., 2024). These devices were fitted on the vehicles to monitor their condition. Because the data was obtained from several areas of the car engine, this is a telemetric approach. The readings covered parameters vital for engine monitoring, including Engine RPM, Lubricant Oil Pressure, Fuel Pressure, Coolant Pressure, Lubricant Oil Temperature, and Coolant Temperature. Every engine in the dataset fell into one of two categories: good engine health (coded as 1) or poor engine health (coded as 0). The thresholds defining these categories were guided by standard fault criteria. The main objective of the gathered data was to enable a supervised learning model to find patterns in the sensor readings and hence diagnose faults.

3.4.  Data Cleaning

Data cleaning, as depicted in Figure 3.1, is one of the preprocessing operations carried out before any algorithm is developed. It includes removing missing values, replacing or augmenting missing data, and feature engineering. Cleaning the dataset guarantees its integrity and dependability for modelling and analysis. Missing values were processed first. Any missing values in the dataset were removed rather than filled in. This method was selected to preserve the integrity of the relationships among variables, since imputing missing values (for example, with mean or median values) may bias or distort the data. Given their rather low frequency, it was possible in this study to exclude missing values. Next, outlier removal was used to exclude data points significantly outside the norm. Outliers can be produced by uncommon events, measurement errors, or malfunctioning sensors. Their elimination was vital since outliers can skew statistical estimates such as the mean and standard deviation, producing models that cannot generalise effectively to new data.
Lastly, smoothing techniques such as moving averages and spline fitting were tested on the dataset. These methods proved unnecessary, however, since the dataset was already clean and exhibited no considerable noise that might compromise the study.
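
The two cleaning steps described above, removal of missing values followed by outlier removal, can be sketched in Python as follows. This is an illustration under stated assumptions and not the MATLAB procedure used in the study: the report does not specify the outlier rule, so the common 1.5 × IQR rule is assumed here, and the column names and readings are invented for the example.

```python
import statistics


def drop_missing(rows):
    """Keep only rows in which every field has a value (no imputation)."""
    return [row for row in rows if all(value is not None for value in row.values())]


def iqr_bounds(values, k=1.5):
    """Lower and upper limits under the k * IQR rule (an assumed outlier rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, columns, k=1.5):
    """Drop rows whose value in any listed column lies outside its IQR limits."""
    bounds = {col: iqr_bounds([row[col] for row in rows], k) for col in columns}
    return [
        row for row in rows
        if all(bounds[col][0] <= row[col] <= bounds[col][1] for col in columns)
    ]


# Hypothetical readings: one row has a missing value, one has an implausible RPM.
readings = [
    {"rpm": 700, "coolant_temp": 80.0},
    {"rpm": 720, "coolant_temp": 82.0},
    {"rpm": 710, "coolant_temp": 81.0},
    {"rpm": 690, "coolant_temp": 79.0},
    {"rpm": 705, "coolant_temp": None},
    {"rpm": 715, "coolant_temp": 83.0},
    {"rpm": 5000, "coolant_temp": 80.5},
    {"rpm": 695, "coolant_temp": 81.5},
    {"rpm": 725, "coolant_temp": 82.5},
]

complete = drop_missing(readings)
cleaned = remove_outliers(complete, ["rpm", "coolant_temp"])
print(f"{len(readings)} rows -> {len(complete)} complete -> {len(cleaned)} after outlier removal")
```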


 


Figure 3.1 Data Cleaning Operation









3.5.  Data Modelling, Simulation, and Testing

The several phases of modelling, simulation, and testing of the dataset ensured the development of a strong and dependable model for engine faults. This section first covers MATLAB and the tools and approaches applied, then the decision tree methodology, the model creation process, and the performance assessment of the model.


 3.5.1. MATLAB

MATLAB (Figure 3.2) was selected for this research because of its extensive capabilities in data analysis, visualisation, and machine learning (Mathworks, 2023). Its Classification Learner app provided a complete platform for fast testing of algorithms. MATLAB supports multiple machine learning techniques, including decision trees, discriminant analysis, support vector machines, logistic regression, nearest neighbours, naive Bayes, kernel approximation, ensembles, and neural networks.

While offering visual feedback on model performance, the Classification Learner allows simple integration of data preprocessing techniques, including dimensionality reduction and outlier removal. MATLAB's capacity to manage large amounts of data and run computationally demanding operations ensured smooth and quick construction of the model. Its built-in support for cross-validation and hyperparameter tuning further improved the dependability of the results.


Figure 3.2 MATLAB Logo

MathWorks. (n.d.). MATLAB logo [Digital image]. Retrieved January 23, 2025, from https://www.mathworks.com/company/aboutus.html 


3.5.2 Comparative Analysis with Other Machine Learning Models

Many machine learning models can be used for predictive maintenance. In MATLAB, these models are found in the Classification Learner and include Support Vector Machines (SVMs), Random Forests, Neural Networks, K-Nearest Neighbours (KNNs), Ensembles, and Decision Trees. SVMs are used for binary classification tasks, but they often do not perform well on big datasets. Neural Networks are suited to complex datasets; however, their performance depends heavily on how they are trained. Before the model was selected, all the available ML models were trained, and Table 3.1 shows the results of the training.

Table 3.1 Comparative Analysis of Machine Learning Model Results

Machine Learning Model      Accuracy
Linear SVM                  65.5%
Quadratic SVM               Not Specified
Cubic SVM                   Not Specified
Fine Gaussian SVM           64.1%
Medium Gaussian SVM         66.1%
Coarse Gaussian SVM         65.5%
Fine KNN                    58.7%
Medium KNN                  62.9%
Coarse KNN                  66%
Cosine KNN                  63.6%
Cubic KNN                   62.9%
Weighted KNN                62.6%
Boosted Trees               66.5%
Bagged Trees                63.2%
Subspace Discriminant       65%
Subspace KNN                61.9%
Medium Decision Tree        65.7%
Fine Decision Tree          65.4%
Coarse Decision Tree        65.5%
RUSBoosted Trees            62.3%


Although the decision tree underperformed slightly compared with Boosted Trees (66.5%) and Medium Gaussian SVM (66.1%), it was chosen because of its interpretability, its ability to categorise and handle numerical data, and its quick deployment. It also required less computational power, and the model could be trained easily on the dataset.


3.5.3 Classification Decision Tree

Given its interpretability and efficiency in supervised learning applications, the decision tree technique was selected as the main model for this work. Starting with a root node and branching out to leaf nodes, decision trees map out alternative decisions and their consequences, producing a flowchart-like structure. Every internal node makes a decision based on a certain feature, and every leaf node corresponds to a class label, such as "good" or "bad" engine health. The different kinds of decision trees and their uses were discussed in the literature review. Since the dataset uses a binary classification of engine health (1 for healthy engines and 0 for bad engines), a classification tree was most suitable for this investigation. The Gini diversity index was selected as the split criterion, as it minimizes impurity at each split and improves the model's ability to separate good and bad engine conditions effectively. The Gini index measures the probability of a randomly chosen element being misclassified, making it a reliable metric for this task. Classification trees predict nominal responses (e.g., "true" or "false"). To arrive at a prediction, each step checks the value of one predictor against a condition that is either true or false. The prediction begins at the top (root) node, and a sequence of decisions is made from node to node until a leaf node is reached. Figure 3.3 shows the different classes of decision trees in MATLAB.


Figure 3.3 Classes of Decision Tree


3.5.4. Model Development

The process of model development began with uploading the dataset, which consisted of predictors such as Engine RPM, Fuel Pressure, Lubricant Oil Pressure, Coolant Pressure, Lubricant Oil Temperature, and Coolant Temperature. These were the data obtained from the sensor readings.

The Classification Learner is found in the Machine Learning and Deep Learning group of MATLAB apps.

The dataset, which was in comma-separated values (CSV) format, was uploaded to the Classification Learner workspace. The variables were then designated as response and predictor variables. The predictor variables are the six engine parameters (Engine RPM, Fuel Pressure, Lubricant Oil Pressure, Coolant Pressure, Lubricant Oil Temperature, and Coolant Temperature), whilst the response variable is the engine condition, coded as 0 or 1.

3.5.4.1. Validation Scheme

Cross-validation divides the dataset into k folds, training a model on k - 1 folds and validating it on the remaining fold. This process repeats for each fold, and the validation errors are averaged to estimate predictive accuracy. While computationally intensive, it maximizes data usage and is ideal for small datasets. Holdout validation splits a dataset into training and validation sets and assesses performance on the validation set. It is simpler but uses the data less efficiently, making it suitable only for large datasets.

Resubstitution validation uses all the data for both training and testing, offering no protection against overfitting. It tends to overestimate model accuracy, as the same data is used for both training and evaluation. Holdout validation was the most suitable validation scheme for this research; however, cross-validation was used because it was expected to give a better performance result.
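
The fold rotation used in cross-validation can be sketched in Python as shown below. This is a generic illustration and not the MATLAB implementation: the number of folds used in the study is not stated in this report, so k = 5 is assumed, and `fit_and_score` is a hypothetical callback that trains a model on the training indices and returns its accuracy on the validation indices.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def cross_validated_accuracy(n_samples, k, fit_and_score, seed=0):
    """Average the score returned by `fit_and_score` over all k folds."""
    scores = [
        fit_and_score(training, validation)
        for training, validation in k_fold_indices(n_samples, k, seed)
    ]
    return sum(scores) / len(scores)


# Small demonstration with 10 samples and an assumed k of 5.
splits = list(k_fold_indices(10, 5))
for fold_number, (training, validation) in enumerate(splits, start=1):
    print(f"Fold {fold_number}: train on {len(training)} samples, validate on {sorted(validation)}")
```

Each sample appears in exactly one validation fold, which is why the scheme makes full use of the available data.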

3.5.4.2. Testing Data

The dataset was split into two portions: 70% was used for training the model, while the remaining 30% was reserved for testing. This split ensured that the model was evaluated on unseen data, which is critical for assessing its generalizability.
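
A minimal Python sketch of such a 70/30 split is given below. It only illustrates the idea: the function name `holdout_split` and the fixed random seed are assumptions for this example, and the split in the study itself was performed in MATLAB.

```python
import random


def holdout_split(n_samples, test_fraction=0.30, seed=0):
    """Shuffle the sample indices and reserve `test_fraction` of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# The dataset in this study contains 19,534 engine records.
train_idx, test_idx = holdout_split(19534)
print(f"Training records: {len(train_idx)}, testing records: {len(test_idx)}")
```

Shuffling before splitting avoids any ordering in the source file leaking into the two portions.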

3.5.4.3. Model Simulation

The Classification Learner model was simulated. The simulation produced visual analyses in the form of scatter plots, a confusion matrix, an ROC curve, and a precision-recall curve. These were used to investigate which variables were best for prediction.

The Classification Learner in MATLAB was used to build and train three types of decision trees. The fine tree is a highly detailed tree with many leaf nodes (maximum splits: 100); it offers fine distinctions between classes, which helps to spot subtle trends. The medium tree is moderately complex (maximum splits: 20) and balances simplicity and detail. The coarse tree is a simpler tree with fewer leaf nodes (maximum splits: 4) and gives a broad view of the classification. Each of these models was trained to identify the one with the best accuracy.

Hyperparameter tuning and feature selection were used to improve model output. The dimensionality of the predictor space was reduced using Principal Component Analysis (PCA). PCA reduced the risk of overfitting by converting the data into a lower-dimensional representation while preserving the most important information. Hyperparameter tuning was also carried out, in which the split criterion, maximum depth, and number of splits were varied to maximise the performance of the model.

All three decision tree models were trained after the predictors were configured and the parameters adjusted. The training procedure involved running the dataset through the models and evaluating their accuracy and error rates. Further model refinement was done using MATLAB's misclassification cost feature, which compares the true class with the predicted class (1 for good conditions and 0 for bad conditions).

 Since all six (6) predictors were significant for the analysis, they were all applied in training the models.

 3.5.5 Model Evaluation

The performance of the trained models was assessed by comparison on the validation set. The key metrics examined were as follows. Accuracy is the percentage of observations the model labels correctly. The error rate is the proportion of misclassified observations and points to areas that may need improvement. Training time is the time needed to build the model, which varies with the method and dataset size. Prediction speed measures how quickly the model produces outputs for new data, which is vital for real-time applications. Finally, model size indicates the complexity and storage needs of the model, which can affect its scalability and deployment in environments with limited resources.

MATLAB's analysis produced other metrics as well: precision, recall, and F1 scores. Macro, micro, and weighted averages of these measures were examined to give an overall assessment of the performance of the model. A tabular breakdown of per-class metrics further provided the precision, recall, and F1 scores for the two classes, good and bad engine conditions.

 Reviewing the hyperparameters of the decision tree—that is, split criteria and number of splits—helps one find the model that performs best.  For example, because of its detailed differences, the Fine Tree model usually offered better accuracy; but the Medium Tree gave a better mix between simplicity and accuracy.

Finally, the Gini diversity index was examined as the chosen split criterion. It computes the likelihood of misclassification by summing the squared probabilities of every class and subtracting the result from one. This method ensures that every split maximises the purity of the resulting subsets, improving the predictive ability of the decision tree. Figure 3.4 displays the output of the training results.


Figure 3.4 Training Results
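
The Gini diversity index described in this section, one minus the sum of the squared class probabilities, can be written as a short Python function. The sketch below is an illustration only: the function names and the toy labels (1 for good, 0 for bad) are assumptions, and the weighting of the two child nodes by their size follows the usual CART convention rather than anything specific to this study.

```python
from collections import Counter


def gini(labels):
    """Gini diversity index: 1 minus the sum of squared class probabilities."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini index of the two child nodes produced by a split."""
    total = len(left) + len(right)
    return (len(left) / total) * gini(left) + (len(right) / total) * gini(right)


# Hypothetical toy labels: 1 = good engine, 0 = bad engine.
mixed_node = [1, 1, 0, 0]
pure_node = [1, 1, 1, 1]
candidate_split = split_gini([1, 1, 1, 0], [0, 0, 0, 1])

print(f"Mixed node: {gini(mixed_node):.3f}")
print(f"Pure node: {gini(pure_node):.3f}")
print(f"Candidate split: {candidate_split:.3f}")
```

A pure node scores 0 and an evenly mixed two-class node scores 0.5, so a split is preferred when it lowers the weighted index of the child nodes.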


3.5.5.1. Model Metrics

Precision is the fraction of correctly predicted positives among all positive predictions; higher values indicate fewer false positives. Recall, also known as sensitivity, is the fraction of true positives found among all actual positives; higher values indicate fewer false negatives. The F1 score balances the two measures by combining precision and recall into a harmonic mean.

Three methods of averaging were used. The macro average computes the unweighted mean across all classes and is suited to evaluating general performance independent of class distribution. The micro average pools the predictions of all classes before computing the metric, treating every prediction equally. The weighted average weights each class by its frequency, which helps with imbalanced datasets.
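
The definitions above can be made concrete with a Python sketch that derives the per-class metrics and the three averages from a confusion matrix. The confusion matrix counts below are hypothetical and are not results from this study; the function names are likewise assumptions for this example. For a single-label problem such as this one, the micro-averaged precision, recall, and F1 score all equal the overall accuracy, which is what the last function returns.

```python
def per_class_metrics(confusion, classes):
    """Precision, recall, F1 and support per class; confusion[true][predicted] holds counts."""
    metrics = {}
    for c in classes:
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in classes if t != c)
        fn = sum(confusion[c][p] for p in classes if p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}
    return metrics


def macro_average(metrics, key):
    """Unweighted mean of a metric across classes."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


def weighted_average(metrics, key):
    """Mean of a metric across classes, weighted by the number of true samples per class."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m[key] * m["support"] for m in metrics.values()) / total


def micro_average(confusion, classes):
    """Pooled score over all predictions (equal to accuracy for single-label data)."""
    correct = sum(confusion[c][c] for c in classes)
    total = sum(confusion[t][p] for t in classes for p in classes)
    return correct / total


# Hypothetical counts: rows are true classes, columns are predicted classes.
classes = [0, 1]
confusion = {0: {0: 40, 1: 60}, 1: {0: 20, 1: 80}}

metrics = per_class_metrics(confusion, classes)
for c in classes:
    m = metrics[c]
    print(f"Class {c}: precision={m['precision']:.3f} recall={m['recall']:.3f} f1={m['f1']:.3f}")
print(f"Macro recall: {macro_average(metrics, 'recall'):.3f}")
print(f"Weighted recall: {weighted_average(metrics, 'recall'):.3f}")
print(f"Micro average: {micro_average(confusion, classes):.3f}")
```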

3.6. Development and Design of Predictive Model

Engine RPM (R), Fuel Pressure (FP), Lubricant Oil Pressure (LOP), Lubricant Oil Temperature (LOT), Coolant Temperature (CT), and Coolant Pressure (CP) are key engine parameters used as inputs in a decision tree framework guiding the development of the predictive model for estimating engine conditions and fault diagnosis.  These values are compared to predefined logical thresholds to determine the Engine Condition (EC) as either GOOD (EC = 1) or Bad (EC = 0).  The structure of the model stresses the main function of Engine RPM as the root node, with secondary parameters offering extra classification criteria.

 The model first looks for low RPM operation of the engine, where it is less stressed, under good engine conditions.  Should this condition be satisfied, the engine is usually categorised as GOOD, given Lubricant Oil Temperature stays within a safe range.  The engine is rated as GOOD at moderate RPM levels if Fuel Pressure, Lubricant Oil Pressure, and Lubricant Oil Temperature fall within ideal operational ranges.  This guarantees correct engine performance by means of sufficient fuel delivery, lubrication pressure, and temperature.  Stricter thresholds are used at high RPM levels, when the engine is under more stress.  Reflecting the growing needs of higher running speeds, the engine remains classified as GOOD only if Fuel Pressure, Lubricant Oil Pressure, and Lubricant Oil Temperature satisfy more exact criteria.

The model finds failures based on parameter thresholds for bad engine conditions. At low RPM, the engine is considered bad if the lubricant oil temperature exceeds its upper limit, indicating possible overheating. At moderate RPM, bad conditions arise when Fuel Pressure is inadequate, Lubricant Oil Pressure falls below optimal levels, or Lubricant Oil Temperature falls outside the allowed range, indicating problems with fuel delivery, lubrication, or thermal control. At high RPM, the engine is classified as bad if any of these parameters fails its stricter threshold, reflecting the increased vulnerability of the engine under higher operational loads. Extreme Coolant Temperature or insufficient Coolant Pressure can also cause a Bad classification across all RPM ranges, highlighting engine cooling system problems.

 The model combines a root cause analysis ability to identify the parameter or set of parameters causing a bad condition.  For low RPM, for example, overheating brought on by high Lubricant Oil Temperature could be noted as the underlying cause.  Inadequate Fuel Pressure or Lubricant Oil Temperature at moderate RPM might point to a mix of fuel and lubricant system failures.  Likewise, at high RPM, one may find concurrent failures in temperature control, lubrication, and fuel delivery.  This root cause study guarantees accurate fault diagnosis and helps to plan focused maintenance projects.

 Python was used in development of an Engine Condition Analyser to test the model.  Using logical correlations among the parameters, this analyser assesses the engine condition.  It finds whether the engine is in a GOOD or bad condition by accepting inputs for RPM, fuel pressure, lubricant oil pressure, lubricant oil temperature, coolant temperature, and coolant pressure.  Under a bad condition, the analyser finds the parameter or set that causes the failure.  Using the logical framework of the decision tree, the analyser offers a fast and accurate instrument for fault diagnosis and engine condition classification, so supporting proactive maintenance and raising general engine dependability.
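
The logic of such an analyser can be sketched as follows. This is not the analyser developed in the project but a minimal illustration of the rule structure described in this section. Only the low-RPM boundary (444.5 RPM) and the low-RPM oil temperature limit (88.79°C) are taken from the trained tree discussed in Section 4.2.1. Every other threshold, the boundary between moderate and high RPM, and all names in the code are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandLimits:
    """Limits applied to the engine parameters within one RPM band."""
    min_fuel_pressure: float
    min_oil_pressure: float
    max_oil_temp: float


# From Section 4.2.1: the low-RPM boundary and its oil temperature limit.
LOW_RPM_LIMIT = 444.5
LOW_RPM_MAX_OIL_TEMP = 88.79

# Hypothetical placeholders, NOT values from the report.
HIGH_RPM_LIMIT = 1500.0
MAX_COOLANT_TEMP = 100.0
MIN_COOLANT_PRESSURE = 1.0
BAND_LIMITS = {
    # At low RPM only oil temperature is checked, so the pressure minimums are zero.
    "low": BandLimits(0.0, 0.0, LOW_RPM_MAX_OIL_TEMP),
    "moderate": BandLimits(3.0, 2.0, 85.0),
    "high": BandLimits(5.0, 3.0, 80.0),  # stricter limits under higher load
}


def rpm_band(rpm):
    """Classify the engine speed as low, moderate or high."""
    if rpm < LOW_RPM_LIMIT:
        return "low"
    if rpm < HIGH_RPM_LIMIT:
        return "moderate"
    return "high"


def analyse_engine(rpm, fuel_pressure, oil_pressure, oil_temp, coolant_temp, coolant_pressure):
    """Return (engine_condition, causes): 1 with no causes for GOOD, 0 with causes for BAD."""
    limits = BAND_LIMITS[rpm_band(rpm)]
    causes = []
    if fuel_pressure < limits.min_fuel_pressure:
        causes.append("Fuel Pressure")
    if oil_pressure < limits.min_oil_pressure:
        causes.append("Lubricant Oil Pressure")
    if oil_temp > limits.max_oil_temp:
        causes.append("Lubricant Oil Temperature")
    # Cooling system checks apply across all RPM ranges.
    if coolant_temp > MAX_COOLANT_TEMP:
        causes.append("Coolant Temperature")
    if coolant_pressure < MIN_COOLANT_PRESSURE:
        causes.append("Coolant Pressure")
    return (0 if causes else 1), causes


condition, causes = analyse_engine(400, 6.0, 3.0, 90.0, 80.0, 2.0)
print("GOOD" if condition == 1 else f"BAD, caused by: {', '.join(causes)}")
```

Returning the list of failed checks alongside the class label is what allows the root cause of a bad condition to be reported.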














CHAPTER FOUR

    RESULTS AND DISCUSSIONS 

4.1. Analysis of Metric and Scores for Decision Tree

The performance metrics of the Decision Tree model, shown in Table 4.1 and Table 4.2, give an overall accuracy of 65.7%, indicating moderate classification effectiveness. The micro-average metrics, which indicate the overall performance, show consistent precision, recall, and F1 values at 65.7%, while the macro-average metrics highlight a slight imbalance across classes with precision at 62.3%, recall at 60.5%, and an F1 score of 60.7%. This imbalance becomes evident in the per-class metrics, where the model performs significantly better on Class 1 with a high recall of 80.4% and precision of 69.8%, achieving an F1 score of 74.7%. In contrast, the model struggles with Class 0, where the recall is only 40.6%, precision is 54.8%, and the resulting F1 score is much lower at 46.7%. This difference suggests a bias towards Class 1, likely caused by class imbalance or inadequate handling of minority class samples.

Table 4.1 Average Class Metrics

Average     Precision   Recall   F1 Score
Macro       62.3%       60.5%    60.7%
Micro       65.7%       65.7%    65.7%
Weighted    64.3%       65.7%    64.4%

                                                                          


Table 4.2 Per-Class Metrics

Class       Precision   Recall   F1 Score
0           54.8%       40.6%    46.7%
1           69.8%       80.4%    74.7%
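As a cross-check on the reported figures, the short Python sketch below recomputes the macro averages from the per-class values and estimates the share of Class 0 samples implied by the weighted recall. It assumes that the weighted averages are weighted by class support in the test set; the function names are illustrative and are not taken from the project code.

```python
# Per-class metrics as reported in Table 4.2 (percentages).
per_class = {
    0: {"precision": 54.8, "recall": 40.6, "f1": 46.7},
    1: {"precision": 69.8, "recall": 80.4, "f1": 74.7},
}


def macro_average(metric):
    """Unweighted mean of one metric over the two classes."""
    values = [scores[metric] for scores in per_class.values()]
    return sum(values) / len(values)


def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_class0_share(weighted_recall):
    """Fraction of Class 0 samples implied by a support-weighted recall.

    Solves w0 * r0 + (1 - w0) * r1 = weighted_recall for w0.
    """
    r0 = per_class[0]["recall"]
    r1 = per_class[1]["recall"]
    return (r1 - weighted_recall) / (r1 - r0)


macro = {m: round(macro_average(m), 1) for m in ("precision", "recall", "f1")}
share0 = implied_class0_share(65.7)
print(macro)
print(f"Implied Class 0 share: {share0:.2f}")
```

Under that assumption the implied Class 0 share is about 0.37, which is consistent with the class imbalance suggested above.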


4.2. Decision Tree Performance

The decision tree model classifies engine conditions into two categories, GOOD (Class 1) and BAD (Class 0), based on a set of fundamental parameters that affect engine performance. Engine RPM appears as the root node and drives the classification, making it the main indicator of engine health. Later splits on Fuel Pressure, Lubricant Oil Temperature, Lubricant Oil Pressure, Coolant Pressure, and Coolant Temperature refine the predictions. Together, these features enable the model to evaluate the operational condition of the engine, and each parameter contributes to differentiating GOOD from BAD conditions. The structure of the decision tree reflects the relative importance of each feature in the predictions. The trained decision tree is shown in Figure 4.1.


Figure 4.1 Display of Trained Decision Tree

 


4.2.1. Analysis of Decision Tree and Breakdown

The decision tree model for classifying engine conditions into GOOD (Class 1) and BAD (Class 0) is structured as a series of threshold-based splits on the key engine parameters. Engine RPM is the most important factor: it acts as the root node and reappears in several internal nodes, and so governs the classification process. It serves as the primary indicator for determining whether the engine is in a GOOD or BAD state. For low Engine RPM values (less than 444.5 RPM), the engine is generally classified as GOOD unless Lubricant Oil Temperature exceeds a high threshold of 88.79°C. This suggests that the engine is less stressed at low RPM, and that Lubricant Oil Temperature only becomes a concern if it is abnormally high. In that case, the high oil temperature triggers a BAD classification even at low RPM, indicating that it can be a sign of poor engine health regardless of engine speed.
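The low-RPM branch described above can be illustrated with a minimal Python sketch of how a threshold-based tree is traversed. The nested-dictionary layout, the key names, and the placeholder leaf for higher RPM values are assumptions made for illustration; they do not reproduce the trained tree in Figure 4.1, and the handling of values exactly equal to a threshold is also assumed.

```python
# Illustrative fragment of the tree: only the low-RPM branch is spelled out.
# Each internal node holds a feature name, a threshold, and two children:
# "le" is followed when value <= threshold, "gt" otherwise (assumed convention).
TREE = {
    "feature": "engine_rpm",
    "threshold": 444.5,
    "le": {
        "feature": "lub_oil_temp",
        "threshold": 88.79,
        "le": "GOOD",
        "gt": "BAD",
    },
    # Placeholder leaf: the deeper splits for higher RPM are not reproduced.
    "gt": "DEEPER_SPLITS",
}


def traverse(node, reading):
    """Follow threshold splits from the root until a leaf label is reached."""
    while isinstance(node, dict):
        value = reading[node["feature"]]
        node = node["le"] if value <= node["threshold"] else node["gt"]
    return node


print(traverse(TREE, {"engine_rpm": 400.0, "lub_oil_temp": 80.0}))
print(traverse(TREE, {"engine_rpm": 400.0, "lub_oil_temp": 90.0}))
```

The first call follows the left branch twice and ends at a GOOD leaf, while the second ends at a BAD leaf because the oil temperature is above 88.79°C.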

When the engine operates at moderate RPM (between 444.5 and 1175.5), the classification becomes more nuanced. In this range, the engine is considered GOOD if the following conditions are met: Fuel Pressure must be greater than or equal to 5.15333, ensuring adequate fuel delivery to maintain engine performance; Lubricant Oil Temperature must exceed 75.2928°C, suggesting that the oil temperature is within an optimal range for maintaining lubrication and minimizing wear; and Lubricant Oil Pressure must be greater than or equal to 2.11803, ensuring that the engine’s lubrication system is functioning properly to reduce friction between moving parts. At high Engine RPM (greater than 1175.5 RPM), the engine is under greater stress and therefore requires more stringent thresholds to remain in a GOOD condition. In this case, Lubricant Oil Pressure must be above 2.67553 to ensure sufficient lubrication under the increased load, and Lubricant Oil Temperature must be above 77.4972°C to maintain proper lubrication at high engine speeds.

On the other hand, BAD engine conditions are identified by a set of specific criteria based on Engine RPM and other key parameters. For low Engine RPM (less than 444.5 RPM), a BAD condition arises when Lubricant Oil Temperature exceeds 88.79°C, suggesting that despite the low engine speed, the engine's oil is overheating, potentially leading to engine damage. At moderate Engine RPM, BAD conditions occur if any of the following happen: Fuel Pressure falls below 5.15333, indicating insufficient fuel delivery and impaired engine performance; Lubricant Oil Temperature drops below 75.2928°C, suggesting inadequate oil temperature for proper lubrication; or Coolant Pressure is below 0.9706, signalling poor coolant circulation that can lead to overheating or engine damage. At high Engine RPM, engines are particularly vulnerable to poor performance when Fuel Pressure falls below 5.03708, indicating insufficient fuel delivery under high engine loads; Lubricant Oil Temperature drops below 75.9197°C, suggesting suboptimal oil temperature for lubrication; or Lubricant Oil Pressure drops below 2.67553, indicating insufficient lubrication, which is especially critical under high stress. Additionally, Coolant Temperature becomes a critical factor at high RPM, as reaching or exceeding 86.2043°C strongly indicates BAD engine conditions, even if other parameters like Lubricant Oil Pressure are within acceptable ranges. High coolant temperature suggests engine overheating, which can lead to severe damage despite proper lubrication.


4.3. Analysis of the Predictive Model

The predictive model for engine condition prediction and fault diagnosis was developed using a series of conditional equations derived from a decision tree. This model incorporated key engine parameters—Engine RPM (R), Fuel Pressure (FP), Lubricant Oil Pressure (LOP), Lubricant Oil Temperature (LOT), Coolant Temperature (CT), and Coolant Pressure (CP)—to determine whether the engine was in a GOOD (EC = 1) or BAD (EC = 0) condition. The model functioned by evaluating the relationships between these parameters based on predefined thresholds determined through the decision tree structure.

For a GOOD engine condition, the model initially checked if the Engine RPM (R) was less than 444.5. If this condition was met, the engine was classified as GOOD without the fuel and lubrication checks applied at higher RPM, subject only to the overheating limits described below. However, for RPM values between 444.5 and 783.5, additional conditions were evaluated to confirm the engine’s GOOD condition. These conditions included verifying that the Fuel Pressure (FP) was greater than or equal to 5.15333, the Lubricant Oil Temperature (LOT) was greater than or equal to 75.2928, and the Lubricant Oil Pressure (LOP) was greater than or equal to 2.11803. For RPM values greater than 783.5, stricter lubrication thresholds were applied, requiring the Fuel Pressure to exceed 5.03708, the Lubricant Oil Pressure to exceed 2.67553, and the Lubricant Oil Temperature to exceed 77.4972 for the engine to remain classified as GOOD.

In contrast, when the engine was classified as having a BAD condition (EC = 0), the model identified the specific parameters contributing to the engine’s failure. At low RPM values (less than 444.5), the engine was classified as BAD if the Lubricant Oil Temperature exceeded 88.7908, indicating a potential overheating issue. At moderate RPM values (444.5 to 783.5), BAD conditions were flagged if the Fuel Pressure was less than 5.15333, the Lubricant Oil Temperature was below 75.2928, or the Lubricant Oil Pressure was below 2.11803. For RPM values between 783.5 and 1175.5, the engine was flagged as BAD if the Fuel Pressure fell below 5.03708, the Lubricant Oil Pressure dropped below 2.67553, or the Lubricant Oil Temperature was below 77.4972. At RPM values greater than 1175.5, the same conditions applied, with failures in Fuel Pressure, Lubricant Oil Pressure, or Lubricant Oil Temperature indicating engine issues. Across all RPM ranges, the model classified the engine as BAD if the Coolant Temperature exceeded 86.2043, indicating potential cooling system failure.
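The conditional rules described in this section can be sketched in Python as follows. This is a minimal illustration and not the project code shown in the figure: the `Reading` class and `engine_condition` function are hypothetical names, the treatment of values exactly equal to a threshold is an assumption, and Coolant Pressure is accepted as an input but not used because no CP threshold is stated in this section.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    r: float    # Engine RPM
    fp: float   # Fuel Pressure
    lop: float  # Lubricant Oil Pressure
    lot: float  # Lubricant Oil Temperature
    ct: float   # Coolant Temperature
    cp: float   # Coolant Pressure (unused: no threshold given in this section)


def engine_condition(x):
    """Return EC = 1 (GOOD) or EC = 0 (BAD) from the thresholds in Section 4.3."""
    # High coolant temperature is flagged in every RPM range.
    # Assumption: a value exactly at the limit counts as a failure.
    if x.ct >= 86.2043:
        return 0
    # Low RPM: only oil overheating is checked.
    if x.r < 444.5:
        return 0 if x.lot >= 88.7908 else 1
    # Moderate RPM (444.5 to 783.5) and higher RPM use different minimums.
    if x.r <= 783.5:
        fp_min, lop_min, lot_min = 5.15333, 2.11803, 75.2928
    else:
        fp_min, lop_min, lot_min = 5.03708, 2.67553, 77.4972
    healthy = x.fp >= fp_min and x.lop >= lop_min and x.lot >= lot_min
    return 1 if healthy else 0


# Hypothetical readings: the same lubricant values pass at 700 RPM
# but fail the stricter limits at 951 RPM.
print(engine_condition(Reading(700, 6.0, 2.2, 76.0, 80.0, 2.0)))
print(engine_condition(Reading(951, 6.0, 2.2, 76.0, 80.0, 2.0)))
```

Selecting the threshold set from the RPM band first keeps the three comparisons identical across bands, which makes the stricter limits at higher RPM easy to audit against the text.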

The model also included a root cause analysis function to determine the specific parameter or combination of parameters responsible for a BAD condition. For example, at low RPM values, overheating due to high Lubricant Oil Temperature (LOT ≥ 88.7908) was identified as a root cause. At moderate RPM values, insufficient Fuel Pressure (FP < 5.15333), low Lubricant Oil Temperature (LOT < 75.2928), or low Lubricant Oil Pressure (LOP < 2.11803) were flagged as root causes. For higher RPM values (783.5 to 1175.5 or greater than 1175.5), simultaneous failures in Fuel Pressure (FP < 5.03708), Lubricant Oil Pressure (LOP < 2.67553), and Lubricant Oil Temperature (LOT < 77.4972) were identified as root causes. Additionally, high Coolant Temperature (CT ≥ 86.2043) across all RPM ranges was flagged as a root cause of engine failure.
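The root cause step can be illustrated with a small rule table in Python. This is a sketch under stated assumptions rather than the project implementation: the rule labels, dictionary keys, and function names are hypothetical, and the thresholds and comparison directions are those quoted in the paragraph above.

```python
import operator

# Each rule is (label, reading key, comparison, limit).
# A rule fires when comparison(value, limit) is true.
COMMON_RULES = [("High Coolant Temperature", "ct", operator.ge, 86.2043)]
LOW_RPM_RULES = [("High Lubricant Oil Temperature", "lot", operator.ge, 88.7908)]
MODERATE_RPM_RULES = [
    ("Low Fuel Pressure", "fp", operator.lt, 5.15333),
    ("Low Lubricant Oil Temperature", "lot", operator.lt, 75.2928),
    ("Low Lubricant Oil Pressure", "lop", operator.lt, 2.11803),
]
HIGHER_RPM_RULES = [
    ("Low Fuel Pressure", "fp", operator.lt, 5.03708),
    ("Low Lubricant Oil Pressure", "lop", operator.lt, 2.67553),
    ("Low Lubricant Oil Temperature", "lot", operator.lt, 77.4972),
]


def rules_for(rpm):
    """Select the rule set for the RPM band, plus the rules common to all bands."""
    if rpm < 444.5:
        band = LOW_RPM_RULES
    elif rpm <= 783.5:
        band = MODERATE_RPM_RULES
    else:
        band = HIGHER_RPM_RULES
    return band + COMMON_RULES


def root_causes(reading):
    """Return the labels of every rule that fires; an empty list means no fault."""
    return [
        label
        for label, key, fires, limit in rules_for(reading["r"])
        if fires(reading[key], limit)
    ]


# Hypothetical reading at moderate RPM with low fuel pressure and hot coolant.
sample = {"r": 600.0, "fp": 4.8, "lop": 3.0, "lot": 78.0, "ct": 90.0, "cp": 2.0}
print(root_causes(sample))
```

Because every rule is evaluated independently, a reading with several out-of-range parameters yields all of them rather than only the first match.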

By assessing thresholds for the important variables, the predictive model classified the engine condition as GOOD or BAD. Furthermore, for BAD conditions the model offered a root cause analysis that pointed out the precise parameter or set of parameters causing the failure. This capability supports better engine performance and service life by enabling exact fault diagnosis and proactive maintenance and repair. Figure 4.2 displays the code of the predictive model.

Figure 4.2 Code for Predictive Model


4.3.1. Performance of the Predictive Model

The implemented predictive model evaluates engine condition from real-time input data for the key parameters: Engine RPM, Fuel Pressure, Lubricant Oil Pressure, Lubricant Oil Temperature, Coolant Temperature, and Coolant Pressure. The model determines whether the engine condition is GOOD or BAD using threshold-based rules. When the condition is flagged as BAD, it reports the root causes derived from the parameter failures. One of the strengths of the model lies in its clear, deterministic rule-based logic, which can also adjust thresholds dynamically when historical datasets are provided. For instance, thresholds for parameters such as Fuel Pressure and Lubricant Oil Temperature are adjusted based on averages and tolerances from the dataset. Similarly, Coolant Temperature thresholds are scaled to account for operational variations.
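The exact adjustment formula is not stated here, so the following Python sketch shows only one plausible reading of "averages and tolerances". It assumes that a lower limit is the historical mean reduced by a fractional tolerance, that an upper limit is the historical mean multiplied by a scale factor, and that the static decision tree threshold is used when no history is available. The function names, tolerance, scale factor, and sample values are all hypothetical.

```python
from statistics import mean


def lower_limit(history, tolerance, fallback):
    """Minimum acceptable value: historical mean minus a fractional tolerance.

    Falls back to the static threshold when no historical data is supplied.
    """
    if not history:
        return fallback
    return mean(history) * (1.0 - tolerance)


def upper_limit(history, scale, fallback):
    """Maximum acceptable value: historical mean multiplied by a scale factor."""
    if not history:
        return fallback
    return mean(history) * scale


# Hypothetical historical fuel pressure and coolant temperature samples.
fuel_pressure_history = [6.0, 7.0, 8.0]
coolant_temp_history = [76.0, 78.0, 80.0]

fp_min = lower_limit(fuel_pressure_history, tolerance=0.10, fallback=5.15333)
ct_max = upper_limit(coolant_temp_history, scale=1.10, fallback=86.2043)
fp_min_static = lower_limit([], tolerance=0.10, fallback=5.15333)
print(fp_min, ct_max, fp_min_static)
```

Keeping the static value as an explicit fallback makes it clear which thresholds come from the decision tree and which come from the supplied dataset.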

The model is able to identify granular root causes by analysing individual and combined parameter failures. It distinguishes between fault conditions based on operational ranges of Engine RPM: at low RPM (< 444.5), minimal parameter checks are performed; at mid RPM (444.5–783.5), Fuel Pressure, Lubricant Oil Pressure, and Lubricant Oil Temperature are monitored more closely; and at high RPM (783.5–1175.5), stricter validation is applied due to increased operational stress. In an earlier version, minor bugs in the logical flow caused the output to default to "Unknown Issue" for partial failures. These issues were resolved by refining the condition checks so that individual and combined failures trigger the appropriate root cause. For instance, the revised model correctly identified Lubricant Oil Pressure and Lubricant Oil Temperature as the problems at an Engine RPM of 951, avoiding a false negative.
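The kind of logic error described above can be illustrated with a hypothetical Python sketch. The earlier faulty code is not reproduced in this report, so `diagnose_buggy` is only an assumed reconstruction of how a chain of exact-match conditions can fall through to "Unknown Issue", and `diagnose_fixed` shows one way of collecting every failed check. The readings at 951 RPM are invented for illustration, and the "No fault detected" label is likewise an assumption.

```python
def diagnose_buggy(fp_low, lop_low, lot_low):
    """Assumed faulty logic: only single and all-parameter failures are matched."""
    if fp_low and lop_low and lot_low:
        return "Fuel Pressure, Lubricant Oil Pressure, Lubricant Oil Temperature"
    if fp_low and not (lop_low or lot_low):
        return "Fuel Pressure"
    if lop_low and not (fp_low or lot_low):
        return "Lubricant Oil Pressure"
    if lot_low and not (fp_low or lop_low):
        return "Lubricant Oil Temperature"
    # Any two-parameter failure falls through to this default.
    return "Unknown Issue"


def diagnose_fixed(fp_low, lop_low, lot_low):
    """Collect every failed check so that partial failures are reported."""
    checks = (
        ("Fuel Pressure", fp_low),
        ("Lubricant Oil Pressure", lop_low),
        ("Lubricant Oil Temperature", lot_low),
    )
    failed = [name for name, is_low in checks if is_low]
    return ", ".join(failed) if failed else "No fault detected"


# Hypothetical readings at 951 RPM, checked against the limits for RPM > 783.5.
fp, lop, lot = 6.0, 2.2, 76.0
flags = (fp < 5.03708, lop < 2.67553, lot < 77.4972)
before = diagnose_buggy(*flags)
after = diagnose_fixed(*flags)
print(before)
print(after)
```

With these readings the first function returns "Unknown Issue", while the second names both Lubricant Oil Pressure and Lubricant Oil Temperature.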

Despite its strengths, the model has certain limitations. When no historical dataset is available, it depends on static thresholds that may not account for engine-specific variations or environmental conditions. It also lacks predictive power: it concentrates on real-time fault detection and does not identify trends that suggest future breakdowns. In addition, the model assumes that the parameters are independent, although in real-world situations variables such as Lubricant Oil Pressure and Lubricant Oil Temperature are often interdependent. Moreover, although the root cause outputs are clear, the model could be extended to offer more thorough diagnostics, including maintenance suggestions or corrective actions.

As a rule-based engine condition analyser, the model performs consistently and efficiently detects parameter failures and their causes. Through dynamic threshold adjustment and improved logic, it avoids errors such as the earlier "Unknown Issue" outputs. Going forward, adding predictive analytics, automating threshold tuning, and accounting for parameter interdependence would further improve the accuracy and utility of the model. These improvements could make it a complete predictive diagnostic tool for engine performance analysis; at present it is suited only to quick fault detection. The output of the Engine Classifier is shown in Figure 4.3.


Figure 4.3 Output of Predictive Model















CHAPTER FIVE

       CONCLUSION AND RECOMMENDATIONS

5.1. Conclusion

The research successfully analysed the engine health of over 19,000 cars. The approach involved modelling a decision tree, designing a predictive model, and building an Engine Classification analyser in Python that predicts and detects automobile engine faults. The Decision Tree model for fault detection highlights the importance and role of Machine Learning in automobile maintenance.

Machine Learning allows automobile systems to be understood much better and helps to compensate for the limitations of human judgement. This research has shown the strength of Predictive Maintenance (PdM) in detecting faults. It shows that PdM goes beyond indicating when faults will occur: it also gives insight into where and how they occurred, reducing trial and error in automobile maintenance. This further highlights the role of big data in improving systems and processes.

From the research, Engine RPM is the most critical predictor of engine conditions, serving as the primary classification feature. At low RPM levels (< 444.5), engines are generally classified as GOOD unless Lubricant Oil Temperature is excessively high. Moderate RPM levels (444.5–1175.5) require additional checks on Fuel Pressure, Lubricant Oil Temperature, Lubricant Oil Pressure, and Coolant Pressure to determine engine health. At high RPM (≥ 1175.5), engines face greater operational demands, necessitating sufficient Lubricant Oil Pressure and Lubricant Oil Temperature to sustain GOOD performance.

Key thresholds determine classification: Fuel Pressure must be ≥ 5.15333 for moderate RPM and ≥ 5.03708 for high RPM to avoid BAD conditions. Lubricant Oil Pressure must meet minimum values of 2.11803 at moderate RPM and 2.67553 at high RPM for proper lubrication. Lubricant Oil Temperature also plays a crucial role, with BAD conditions arising if it falls below 75.2928°C at moderate RPM or 77.4972°C at high RPM. Additionally, inadequate Coolant Pressure (< 0.9706) at moderate RPM or excessive Coolant Temperature (≥ 86.2043°C) can lead to BAD classifications, highlighting the importance of effective cooling.

Overall, while low RPM generally favours GOOD conditions, higher RPM demands optimal performance across all parameters. Any deficiency in Fuel Pressure, Lubricant Oil Temperature, Lubricant Oil Pressure, or Coolant regulation leads to BAD engine classifications, reinforcing the necessity of maintaining optimal operating conditions.

5.2. Recommendation

Based on the insights gained during this research work, the following recommendations are made for future research in this area:

These recommendations highlight the importance of Predictive Maintenance methods in the automobile field and point to their different applications and methods in different domains.


REFERENCES


Albarbar, A., Mekid, S., Starr, A., & Pietrosz, N. (2010). Suitability of vibration and acoustic emission monitoring for diesel engine fault detection. Mechanical Systems and Signal Processing, 24(5), 1500-1517.

Anderson, P., & White, R. (2018). Vehicle telematics and predictive maintenance: The role of sensors in modern automobiles. Journal of Automotive Technology, 15(3), 225-241.

Annamalai, S. (2020). Cloud-based predictive maintenance and machine monitoring for intelligent manufacturing in the automobile industry. IEEE Transactions on Industrial Informatics, 16(4), 2789-2799.

Bagavathiappan, S., Lahiri, B. B., Saravanan, T., Philip, J., & Jayakumar, T. (2013). Infrared thermography for condition monitoring – A review. Infrared Physics & Technology, 60, 35-55. https://doi.org/10.1016/j.infrared.2013.03.006

Banks, J. (2001). Handbook of simulation: Principles, methodology, advances, applications, and practice. John Wiley & Sons.

Banks, J., Carson, J. S., Nelson, B. L., & Nicol, D. M. (2010). Discrete-event system simulation (5th ed.). Prentice Hall.

Bishop, R. (2021). Intelligent vehicle technology and trends. Artech House.

Bosh. (2007). Automotive electrics and electronics (5th ed.). Springer.

Bosh. (2020). Automotive handbook (10th ed.). Bentley Publishers.

Bose, C., Wang, Y., & Kumar, M. (2021). Machine learning techniques for fault detection in automotive systems: A review. IEEE Transactions on Intelligent Vehicles, 6(2), 214-230.

Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (2015). Time series analysis: Forecasting and control (5th ed.). Wiley.

Breiman, L. (2017). Classification and regression trees. Routledge.

Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Wadsworth & Brooks/Cole.

Britannica. (n.d.). Gasoline engine - Development of gasoline engines. Encyclopedia Britannica. Retrieved from https://www.britannica.com/technology/gasoline-engine/Development-of-gasoline-engines

Bunkar, R., Sharma, M., & Patel, H. (2021). Data-driven approaches for predictive maintenance in automotive fault detection. Journal of Intelligent Systems, 30(2), 155-170.

Bunkhumpornpat, C., Sinapiromsaran, K., & Lursinsap, C. (2009). Safe-level-SMOTE: Synthetic minority over-sampling technique for handling the imbalanced problem. Pacific-Asia Conference on Knowledge Discovery and Data Mining, 475-482. Springer.

Carvalho, T. P., Soares, F. A. A. M., Vita, R., Francisco, R. P., Basto, J. P., & Alcalá, S. G. (2019). A systematic literature review of machine learning methods applied to predictive maintenance. Computers & Industrial Engineering, 137, 106024. https://doi.org/10.1016/j.cie.2019.106024

Cengel, Y. A., & Boles, M. A. (2019). Thermodynamics: An engineering approach (9th ed.). McGraw-Hill Education.

Chapelle, O., Scholkopf, B., & Zien, A. (2006). Semi-supervised learning. MIT Press.

Chen, L., Wang, Z., & Liu, Y. (2021). Data-driven insights from vehicle sensors: Applications in predictive maintenance and fault detection. IEEE Transactions on Intelligent Transportation Systems, 22(4), 1932-1945.

Cheng, H. (2020). Predictive maintenance in automotive engineering: A review of sensor-based approaches. International Journal of Prognostics and Health Management, 9(2), 112-129.

Cimino, C., Negri, E., & Fumagalli, L. (2019). Review of digital twin applications in manufacturing. Computers in Industry, 113, 103130. https://doi.org/10.1016/j.compind.2019.103130

Crouse, W. H., & Anglin, D. L. (2013). Automotive mechanics (10th ed.). McGraw-Hill Education.

Department of Energy (DOE). (2022). Advancements in hydrogen fuel cell technology. U.S. Department of Energy. Retrieved from https://www.energy.gov/hydrogen-fuel-cells

Denton, T. (2017). Automobile electrical and electronic systems (4th ed.). Routledge.

Duffy, J. E. (2009). Auto electricity and electronics (5th ed.). Cengage Learning.

Environmental Protection Agency (EPA). (2021). Emission standards for vehicle engines. U.S. Environmental Protection Agency. Retrieved from https://www.epa.gov/vehicle-emissions

Garg, S., & Sharma, R. (2020). Vehicle diagnostics and predictive maintenance: Emerging technologies and challenges. IEEE Transactions on Transportation Electrification, 6(3), 1042-1056.

Giakoumis, E. G. (2016). A review of some recent research on internal combustion engine modeling. MDPI Energies.

Giordano, F. R., Weir, M. D., & Fox, W. P. (2013). A first course in predictive modeling (5th ed.). Cengage Learning.

Gonzalez, F., Ramirez, J., & Torres, M. (2020). Automobile telemetry: From data collection to predictive analysis. Journal of Engineering and Automotive Innovation, 10(2), 89-102.

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.

Heng, A., Zhang, S., Tan, A. C. C., & Mathew, J. (2009). Rotating machinery prognostics: State of the art, challenges, and opportunities. Mechanical Systems and Signal Processing, 23(3), 724-739.

Heywood, J. B. (1988). Internal combustion engine fundamentals. McGraw-Hill Education.

History.com Editors. (n.d.). Ford’s assembly line starts rolling. HISTORY. Retrieved from https://www.history.com/this-day-in-history/fords-assembly-line-starts-rolling

Hunt, T. M. (1993). Handbook of wear debris analysis and particle detection in liquids. Springer.

Isermann, R. (2006). Fault-diagnosis systems: An introduction from fault detection to fault tolerance. Springer Science & Business Media.

International Energy Agency (IEA). (2023). The future of electric vehicle adoption. Retrieved from https://www.iea.org/reports/global-ev-outlook

International Federation of Robotics. (n.d.). History of industrial robots in manufacturing. Retrieved from https://ifr.org/industrial-robots-history

ISO 13306. (2001). Condition-based maintenance standard: Terminology & definitions. International Organization for Standardization.

Jain, M. (2019). Diagnosis of vehicle health using machine learning techniques. International Journal of Automotive Engineering, 8(3), 214-229.

Jardine, A. K. S., Lin, D., & Banjevic, D. (2006). A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, 20(7), 1483-1510. https://doi.org/10.1016/j.ymssp.2005.09.012

Johnson, D. (2019). Advanced automotive sensors and their role in modern vehicles. International Journal of Vehicle Engineering, 18(1), 75-89.

Kalpakjian, S., & Schmid, S. R. (2016). Manufacturing engineering and technology (7th ed.). Pearson Education.

Kaplan, D. (2018). Predictive modeling: Applications with GeoGebra. Wiley.

Kuhn, T. (2019). The Fourth Industrial Revolution and predictive maintenance: Innovations in automation and AI. Wiley.

Kumar, R., Patel, A., & Verma, P. (2018). Advancements in automotive fault detection: A survey on OBD-II and predictive maintenance systems. Journal of Automotive Engineering, 12(4), 231-247.

Kumar, R., Singh, P., & Verma, K. (2018). Decision trees and machine learning models in predictive maintenance for automotive applications. IEEE Access, 6, 58230-58240.

Kumar, S., & Sharma, R. (2021). Automobile sensors: Functions, locations, and applications in modern vehicles. Journal of Automotive Sensor Technology, 12(5), 310-327.

Kurkin, O., Petrov, V., & Smirnov, D. (2011). Efficiency analysis of predictive maintenance techniques for automotive components. Journal of Mechanical Engineering Science, 225(12), 3112-3123.

Lee, H., Kim, J., & Park, S. (2022). Sensor placement in modern vehicles: A study on efficiency and data accuracy. Sensors and Actuators in Automotive Systems, 25(3), 567-582.

Lee, J., Jin, C., Bagheri, B., & Kao, H. A. (2020). Cyber-physical systems for predictive maintenance. Manufacturing Letters, 4(1), 38-41. https://doi.org/10.1016/j.mfglet.2019.10.002

Lee, J., Kao, H. A., & Yang, S. (2014). Service innovation and smart analytics for Industry 4.0 and big data environment. Procedia CIRP, 16, 3-8.

Loh, W. Y. (2011). Classification and regression trees. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1), 14-23.

Loh, W. Y. (2014). Fifty years of classification and regression trees. International Statistical Review, 82(3), 329-348.

MathWorks. (2023). MATLAB and machine learning toolbox documentation. Retrieved from https://www.mathworks.com/help/stats/classification-learner-app.html

Martinez, C., & Jones, B. (2017). The evolution of on-board diagnostics (OBD) and its impact on vehicle maintenance. Automotive Systems and Diagnostics, 8(2), 112-127.

Mazda. (2021). How the rotary engine works. Mazda Global. Retrieved from https://www.mazda.com/en/innovation/rotary

Mitchell, T. M. (1997). Machine learning. McGraw-Hill.

Mobley, R. K. (2002). An introduction to predictive maintenance. Butterworth-Heinemann.

Montgomery, D. C., Peck, E. A., & Vining, G. G. (2021). Introduction to linear regression analysis (6th ed.). Wiley.

Moudgil, V., Singh, G., & Kumar, M. (2023). AI-based fault detection and diagnosis in vehicles: A review. Journal of Intelligent Transportation Systems, 27(2), 215-230. https://doi.org/10.1080/15472450.2023.2166143

Mykich, K., Zavushchak, I., & Savka, A. (2018). Predictive maintenance for automotive vehicle engines in military logistics: A study of model hyperparameters. Military Logistics Journal, 10(1), 75-90.

National Bureau of Statistics (NBS). (2023). Nigeria’s automotive sector report. Retrieved from https://www.nigerianstat.gov.ng/automotive-sector

Patel, M., & Singh, V. (2020). The role of sensors in electronic and mechanical systems: A comprehensive review. Journal of Mechanical and Electrical Engineering, 14(4), 190-203.

Peng, Y., Dong, M., & Zuo, M. J. (2010). Current status of machine prognostics in condition-based maintenance: A review. International Journal of Advanced Manufacturing Technology, 50, 297-313. https://doi.org/10.1007/s00170-009-2482-0

Pulkrabek, W. W. (2014). Engineering fundamentals of the internal combustion engine (2nd ed.). Pearson.

Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77-90.

Randall, R. B. (2011). Vibration-based condition monitoring: Industrial, aerospace, and automotive applications. Wiley.

Reif, K. (2014). Fundamentals of automotive and engine technology. Springer.

Ribbens, W. B. (2017). Understanding automotive electronics: An engineering perspective. Butterworth-Heinemann.

Routray, S., Mishra, A., & Pradhan, R. (2021). Unsupervised fault detection in automotive sensors using ICA and PCA. IEEE Transactions on Instrumentation and Measurement, 70(4), 1125-1134.

Setright, L. J. K. (2003). Drive On! A Social History of the Motor Car. Granta Books.

Shafi, A., Khan, M., & Rahman, A. (2021). Vehicle fleet fault detection using OBD interface data and machine learning methods. International Journal of Automotive Technology, 22(3), 175-189.

Shao, H. (2017). A deep learning approach for predictive maintenance with sensor data. IEEE Transactions on Industrial Informatics, 13(3), 1213-1221.

Singh, A., Verma, P., & Kumar, R. (2019). Automotive sensors and their applications: A review on current trends and future directions. IEEE Sensors Journal, 19(11), 4121-4133.

Statista. (2023). Global automotive engine production statistics. Retrieved from https://www.statista.com/global-automotive-engine-production

Stone, R. (1999). Introduction to internal combustion engines (3rd ed.). Palgrave Macmillan.

Stone, R. (2012). Introduction to internal combustion engines (4th ed.). Palgrave Macmillan.

Tagawa, T., Nakamura, H., & Watanabe, K. (2021). Structured denoising autoencoders for predictive maintenance in automotive applications. Machine Learning Applications in Automotive Engineering, 12(2), 99-118.

Taylor, J. (2019). Automobile electronics and troubleshooting: A complete guide. McGraw-Hill.

Theissler, A. (2021). Semi-supervised fault detection for vehicle predictive maintenance using one-class and two-class classifiers. Expert Systems with Applications, 185, 115628.

Tsang, A. H. C. (1995). Condition-based maintenance: Tools and decision making. Journal of Quality in Maintenance Engineering, 1(3), 3-17.

U.S. Department of Energy. (2010). Operations & Maintenance Best Practices Guide: Release 3.0. Office of Energy Efficiency & Renewable Energy.

Van Engelen, J. E., & Hoos, H. H. (2020). A survey on semi-supervised learning. Machine Learning, 109(2), 373-440.

Venkatasubramanian, V., Rengaswamy, R., Yin, K., & Kavuri, S. N. (2003). A review of process fault detection and diagnosis: Part I & II. Computers & Chemical Engineering, 27(3), 293-346. https://doi.org/10.1016/S0098-1354(02)00160-6

Walker, J. (2016). OBD-II: Functions, features, and troubleshooting techniques. SAE International.

Williams, A. J., & Fletcher, T. (2018). Automotive sensors and electronic control systems. McGraw-Hill.

Williams, J. H. (2006). Condition-based maintenance and machine diagnostics. Springer.

Wikipedia contributors. (n.d.). History of the internal combustion engine. Wikipedia, The Free Encyclopedia. Retrieved from https://en.wikipedia.org/wiki/History_of_the_internal_combustion_engine

Zhang, L., Wang, H., & Sun, Y. (2017). A review of automotive on-board diagnostics (OBD) and data-driven fault detection systems. Journal of Vehicle Technology, 25(2), 85-103.

Zhang, Y., Li, X., & Wang, L. (2020). Machine learning-based predictive maintenance for smart manufacturing systems. Journal of Intelligent Manufacturing, 31(1), 1-16.