Kyaw Min Thu

System-Level Controls Engineer — Multi-Domain Industrial Systems

Focused on solving real system failures across control, mechanical, and process domains

Expertise PLC · BMS · SCADA

Background Oil & Gas · Fire Systems · IPC

Approach Isolate · Validate · Fix

01

Section One

Fire Training Simulator

Burner management · SCADA integration · Safety-critical commissioning

Fire Systems

Fire Training Simulator

Burner management system (BMS) design and programming
PLC-controlled gas, air, and ignition sequencing
SCADA integration for operator visibility
Commissioning alongside fire service teams
Safety-critical interlocks and shutdown logic

Field Work

Hands-On System Integration

On-site instrumentation installation and calibration
Sensor loop testing and signal verification
Valve and actuator commissioning
Integration testing across control, mechanical, and process layers

02

Section Two

Field Experience

Offshore operations · Hydraulic top drive systems · Multi-domain integration

Field Commissioning

Offshore Systems Experience

Hydraulic top drive systems on offshore drilling rigs
Installation, commissioning, and hands-on troubleshooting
Mechanical, hydraulic, and control system integration
Operating in high-risk, safety-critical environments
Working with multinational field teams under pressure

Key insight from the field: this is where I learned that failures are most often caused by the physical system, not the control logic.

System Integration

Multi-Domain System Thinking

Real systems require coordination across multiple engineering domains simultaneously.

Mechanical · Hydraulic · Electrical · Control / PLC · Human Interface

"Failures are rarely isolated to one domain. Identifying which system layer is actually responsible — before making any changes — is the first step."

03

Section Three — Core Competency

Root Cause Analysis

Structured isolation · Data-driven thinking · System-level reasoning

RCA Approach

How I Approach Complex Failures

Principles

Avoid assumptions — verify the actual condition
Separate symptom from root cause
Validate with data, not intuition
Isolate domains before changing anything
Do not replace components blindly

Why This Matters

Control system can behave correctly while the process fails
Intermittent issues often indicate process instability
Random fixes waste time and mask the real cause
Instrumentation gaps hide root causes entirely

7-Step Framework

Structured Investigation Process

1. Define the symptom clearly: when, how often, under what conditions
2. Identify the affected domain: control, mechanical, fluid, electrical, or process
3. Collect data: logs, signals, pressure readings, valve timing
4. Isolate variables: verify each domain independently
5. Test the hypothesis: confirm the suspected cause is repeatable
6. Confirm the root cause: validate with evidence, not assumption
7. Implement the fix and validate: correct, test, then prevent recurrence

RCA Case Study

Intermittent Fire Ignition Failure

Problem Statement

Fire ignition failed intermittently
System required multiple attempts to start
No clear pattern — appeared random
No alarms triggered on the control side

Initial Observations

Electrical trigger signal present and correct
BMS responded as expected
Gas valve and air valve actuated normally
No pressure instrumentation available — no visibility

Key insight: Everything in the control layer appeared correct. The issue had to originate in the physical process layer.

Failure Path

Why Ignition Was Failing

Compressor (air source) → Air Filter (heavily clogged ⚠) → Air Valve (opens correctly) → Burner (ignition phase)

At the clogged filter: insufficient air pressure → ignition fails

Why It Was Intermittent

Compressor needed time to rebuild pressure after each attempt. Eventually pressure was sufficient — so the system would "self-recover," masking the root cause.

Why It Was Invisible

No pressure transmitter was installed. The control system had no way to see the pressure drop. All signals looked correct.
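
The self-recovery effect can be illustrated with a toy model. This is a minimal sketch, not a reconstruction of the actual system: the function name, the linear pressure-recovery assumption, and every number below are invented for illustration, since no pressure data existed at the time.

```python
def attempts_until_ignition(required_kpa, residual_kpa, recovery_kpa_per_s,
                            idle_s, retry_s, max_attempts=10):
    """Return the 1-based attempt on which ignition succeeds, or None.

    Assumes (hypothetically) that air pressure downstream of the filter
    recovers linearly, at a rate limited by how clogged the filter is.
    """
    # Pressure available at the first attempt depends on how long the
    # system has been idle since the previous ignition cycle.
    pressure = residual_kpa + recovery_kpa_per_s * idle_s
    for attempt in range(1, max_attempts + 1):
        if pressure >= required_kpa:
            return attempt
        # A failed attempt is followed by a wait, during which the
        # compressor keeps rebuilding pressure through the filter.
        pressure += recovery_kpa_per_s * retry_s
    return None


# Illustrative values only (kPa, seconds); none are measurements.
clean = attempts_until_ignition(400, 200, recovery_kpa_per_s=40, idle_s=10, retry_s=10)
clogged_short_idle = attempts_until_ignition(400, 200, recovery_kpa_per_s=5, idle_s=10, retry_s=10)
clogged_long_idle = attempts_until_ignition(400, 200, recovery_kpa_per_s=5, idle_s=60, retry_s=10)
print(clean, clogged_short_idle, clogged_long_idle)  # 1 4 1
```

Under these assumptions the same clogged filter produces a first-attempt success after a long idle period and several failed attempts after a short one, which is the kind of behavior that looks random when pressure is not measured.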

Steps 2–4 · Investigation

Domain Isolation Approach

Domains Verified Independently

Control / BMS ✓ · Valves / Mechanical ✓ · Air Supply: no data · Ignition Electrical ✓

Control, mechanical, and electrical layers all verified correct. The uninstrumented air supply became the sole remaining domain.

Pattern That Confirmed Direction

Failures correlated with delayed ignition cycles
System recovered on its own after multiple attempts
Recovery pattern indicated process-side instability
Logic errors do not self-recover — process conditions do

Steps 5–7 · Resolution

Root Cause, Fix & Outcome

Root Cause

Air compressor needed time to build sufficient pressure
Air filter heavily contaminated — restricting airflow
Insufficient air pressure at ignition phase
No sensor → failure invisible to the system

Corrective Actions

Replaced contaminated air filter
Added air pressure transmitter
Integrated pressure monitoring into SCADA
Implemented preventive maintenance schedule
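
One way the new pressure signal could be turned into a SCADA alarm is a delayed low-pressure check. The sketch below is an assumption about how such monitoring might look, written in Python for readability rather than in PLC or SCADA tooling; the class name, threshold, and delay are hypothetical and are not taken from the installed system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LowAirPressureAlarm:
    """Alarm that activates when pressure stays below a limit for a set time."""
    low_kpa: float                       # hypothetical alarm threshold
    delay_s: float                       # hypothetical delay to ignore brief dips
    _low_since: Optional[float] = None   # time the pressure first went low

    def update(self, pressure_kpa: float, now_s: float) -> bool:
        """Feed one transmitter reading; return True while the alarm is active."""
        if pressure_kpa >= self.low_kpa:
            # Pressure is healthy: clear the timer and the alarm.
            self._low_since = None
            return False
        if self._low_since is None:
            self._low_since = now_s
        # Alarm only once the low condition has persisted for delay_s.
        return now_s - self._low_since >= self.delay_s
```

The delay keeps a momentary dip during valve opening from raising a nuisance alarm, while a sustained drop such as one caused by a restricted filter becomes visible to the operator.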

Before

Multiple attempts required · Intermittent failure

→

After

Ignition immediate and stable · Zero recurrence

Engineering Lessons

Key Takeaways

Lack of instrumentation hides root causes — add visibility first
Control system may behave correctly while the process fails
Intermittent issues often indicate process-side instability
Preventive maintenance is critical for reliable operation

"Many failures originate from the physical process, not the control system. Without proper instrumentation, these issues are invisible until they become critical."

One-Line Summary

"Root cause: insufficient air pressure due to a clogged filter, resolved by replacing the filter and adding pressure monitoring and a preventive maintenance schedule."

Offshore Reinforcement

Real-Field Debugging — Same Discipline

Limited documentation in offshore field environments
Time pressure to restore operation quickly
Rapid isolation: mechanical, hydraulic, or control?
Cross-domain failure sources with no initial pattern
Decision-making with incomplete information

The same domain-isolation discipline applied offshore — before the fire systems, before the formal framework. The approach came from real field necessity.

RCA Case 2

Intermittent System Shutdown — No Observable Cause

Problem Statement

System stopped unexpectedly with no alarm
Occurred intermittently — daily or a few times per week
Not reproducible on demand
Standard diagnostics returned no fault condition

Key Challenges

No logging or historical event data available
PLC trending insufficient — event rarity made capture impossible
Could not observe the trigger condition in real time
Unknown whether source was control, process, or human

Key decision: PLC trending was insufficient due to event rarity — I needed to implement continuous external monitoring before any diagnosis was possible.

Investigation Strategy

Building Visibility to Find the Fault

Monitoring Scope

All stop conditions and stop commands
All permissives and run conditions
All state transitions — continuously
Any event that could prevent or interrupt operation

Implementation

Implemented a continuous monitoring system in Python
24/7 signal capture — no event missed regardless of timing
Real-time alerting via Slack on any stop event
Correlated multi-source logs to identify timing patterns
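
The general shape of such a monitor can be sketched as a polling loop that logs every state change and alerts on stop events. This is a minimal sketch under stated assumptions, not the original implementation: `read_signals` stands in for whatever PLC driver is used, the signal names are hypothetical, and the alert path assumes a Slack incoming webhook, which accepts a JSON body with a `text` field.

```python
import json
import time
import urllib.request
from datetime import datetime, timezone


def diff_signals(previous, current):
    """Return (name, old, new) for every signal whose value changed."""
    return [(name, previous.get(name), value)
            for name, value in current.items()
            if previous.get(name) != value]


def post_to_slack(webhook_url, text):
    """Send a message through a Slack incoming webhook (URL supplied by caller)."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"})
    urllib.request.urlopen(request, timeout=5)


def monitor(read_signals, notify, log, stop_signals, poll_s=0.1, max_polls=None):
    """Poll signals, log every transition, and notify on stop signals going true.

    read_signals: hypothetical callable returning {signal_name: value}.
    notify:       callable taking a message string (e.g. a Slack poster).
    log:          list-like sink receiving (timestamp, name, old, new).
    max_polls:    None runs forever; a number is useful for testing.
    """
    previous = read_signals()
    polls = 0
    while max_polls is None or polls < max_polls:
        current = read_signals()
        stamp = datetime.now(timezone.utc).isoformat()
        for name, old, new in diff_signals(previous, current):
            log.append((stamp, name, old, new))
            if name in stop_signals and new:
                notify(f"{stamp} stop event: {name} {old} -> {new}")
        previous = current
        polls += 1
        time.sleep(poll_s)
```

In use, `notify` could be `lambda msg: post_to_slack(url, msg)` with a real webhook URL. A polling loop only sees changes that last longer than the poll interval, so a very short button press would need either a faster poll or a latched bit on the PLC side.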

"When the existing tools are insufficient, build the visibility layer first — then diagnose."

Steps 5–7 · Resolution

Root Cause, Fix & Outcome

Root Cause

Intermittent manual input — unintentional button press by associate
Not previously capturable due to complete absence of event logging
Human interaction as a hidden failure source — not suspected initially
Only visible through continuous real-time monitoring

Corrective Actions

Identified exact triggering condition through monitoring data
Monitoring and alerting system permanently retained
Improved system observability — all stop events now visible
Enabled immediate response and verification on any future event
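
Once every transition is logged, finding the trigger becomes a matter of asking which inputs changed shortly before each stop. The following is a minimal sketch of that correlation step, assuming timestamped events sorted by time; the function name, the window length, and the sample events are invented for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def preceding_events(stop_times, input_events, window_s):
    """Map each stop time to the input events seen in the window before it.

    input_events: list of (timestamp_s, name) tuples, sorted by timestamp.
    """
    times = [t for t, _ in input_events]
    result = {}
    for stop in stop_times:
        lo = bisect_left(times, stop - window_s)   # first event inside window
        hi = bisect_right(times, stop)             # last event at or before stop
        result[stop] = [name for _, name in input_events[lo:hi]]
    return result


# Invented sample data: two stops, each preceded by the same input.
events = [(10.0, "guard_door"), (99.6, "stop_pb"),
          (250.0, "speed_change"), (299.8, "stop_pb")]
matches = preceding_events([100.0, 300.0], events, window_s=1.0)
suspects = Counter(name for names in matches.values() for name in names)
print(suspects.most_common(1))  # [('stop_pb', 2)]
```

An input that appears in the window before every stop, and rarely otherwise, is the natural candidate to verify on site.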

Before

Random unexplained stops · No data · No diagnosis possible

→

After

Root cause confirmed · Full event visibility · Zero recurrence

RCA Summary

Two Types of Failures — Two Different Approaches

Type	Case	Approach	Key Tool
Process failure	Fire ignition failure	Physical system analysis — domain isolation across control, mechanical, and process layers	Add instrumentation · Pressure transmitter · PM schedule
Intermittent / human-system	Random system shutdown	Continuous monitoring — capture what existing tools could not see	Python monitoring system · Real-time Slack alerting · 24/7 logging

"The root cause of the second case was an intermittent manual input — solved by implementing a 24/7 monitoring system to capture and correlate stop conditions in real time."

04

Section Four

Relevance & Fit

How this background maps to complex multi-physics systems

Why This Transfers

Multi-Physics System Parallels

My Experience

Hydraulic + mechanical + control integration (offshore)
Gas + air + thermal + ignition sequencing (fire systems)
Failures from the process layer, not control logic
Structured domain isolation under real operating pressure

Complex Multi-Physics Systems

Motion + gas + thermal system coupling
Failures may originate outside control logic
Requires domain isolation and data validation
Instrumentation gaps are risks — visibility is critical

"Where gas flow, thermal conditions, and control logic interact — identifying the correct domain is the first and most critical step."

I focus on systems where control decisions directly impact physical outcomes.

Real systems · Real failures · Real consequences.

Built Complex Systems

Operated in Harsh Environments

Solved Root Causes