Customer Feedback Intelligence System

Jetstar Airways · 2023 – Present

Production LLM system that classifies and routes 100k+ customer complaints, replacing a brittle rule-based pipeline.

The problem

Jetstar's customer complaint routing relied on a fragile rule-based system that struggled with the natural language variability of real customer enquiries. Rerouting between teams was identified as the primary choke point causing resolution delays. The business had articulated only a vague need for "a better way to handle complaints", so internal discovery work was needed to diagnose where the breakdown was actually occurring.

Architecture

[Architecture diagram] Layered architecture with four stages: Input (CRM system: customer feedback + NPS scores) → Pre-processing (NPS score deprioritised; LightGBM reclassifies the score from the text signal) → LLM architecture (Claude API prompt-engineered classifier; KNN clustering with elbow method + silhouette score; RAG layer over internal knowledge) → Output (classification engine with 92% coverage across categories; complaint routing with a category label assigned; ServiceNow ticket created for the respective team).

Approach

Built a production classification and routing system combining the Claude API with KNN clustering across 100k+ historical customer enquiries. A RAG-based architecture was designed to integrate internal knowledge sources. NPS scores were found to be inconsistent with written feedback text, so scores were deprioritised and a LightGBM model was trained to reclassify sentiment directly from the text signal. Classification confidence thresholds were determined using the elbow method and silhouette scoring. Delivered in collaboration with a Data Engineer, Data BA, Data Scientist, and Power BI Analyst, with SME consultation throughout.
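The classify-then-route step can be sketched as a thin orchestration layer. Everything below is illustrative: the category names, queue names, and threshold are placeholders, not Jetstar's production taxonomy, and the classifier is injected as a callable so the LLM call can be stubbed out.

```python
from typing import Callable, Tuple

# Hypothetical category → ServiceNow queue mapping; the real taxonomy
# and queue names are internal.
TEAM_QUEUES = {
    "baggage": "svc-baggage-team",
    "refunds": "svc-refunds-team",
    "flight_change": "svc-schedule-team",
}
FALLBACK_QUEUE = "svc-manual-triage"
CONFIDENCE_THRESHOLD = 0.75  # illustrative; the real thresholds were set empirically


def route_complaint(
    text: str,
    classify: Callable[[str], Tuple[str, float]],
) -> str:
    """Classify a complaint and return the target queue.

    `classify` returns a (category, confidence) pair. Injecting it lets
    the LLM call be swapped for a stub in tests.
    """
    category, confidence = classify(text)
    if confidence < CONFIDENCE_THRESHOLD or category not in TEAM_QUEUES:
        # Low-confidence or unknown categories go to manual triage
        # rather than risk being bounced between teams.
        return FALLBACK_QUEUE
    return TEAM_QUEUES[category]
```

Sending low-confidence results to a fallback queue is one way to attack the rerouting choke point directly: a ticket triaged once by a human beats a ticket bounced between teams.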

Key decisions and trade-offs

Build vs buy

Third-party vendors including Qualtrics, Hightouch, and Amazon Comprehend were evaluated through a structured PoC. The decision to build was made after weighing results against cost for each vendor, and factored in engineering buy-in, which strongly favoured an internal build for long-term maintainability.

Model approach

Three approaches were considered: traditional NLP with advanced feature engineering, prompt-engineered LLMs, and fine-tuned LLMs. Prompt-engineered LLMs via the Claude API were selected based on PoC performance, speed to production, and cost profile relative to fine-tuning.
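A prompt-engineered classifier in this style might look like the sketch below. The prompt wording, label set, and model name are assumptions for illustration, not the production prompt; constraining the model to a fixed label set keeps output parsing trivial.

```python
# Illustrative label set; the production taxonomy is internal.
CATEGORIES = ["baggage", "refunds", "flight_change", "other"]


def build_prompt(text: str) -> str:
    """Constrain the model to a fixed label set so the reply is one word."""
    labels = ", ".join(CATEGORIES)
    return (
        "You are a complaint-routing classifier for an airline.\n"
        f"Classify the complaint into exactly one of: {labels}.\n"
        "Reply with the label only.\n\n"
        f"Complaint: {text}"
    )


def parse_label(raw: str) -> str:
    """Normalise the model reply; fall back to 'other' on anything unexpected."""
    label = raw.strip().lower()
    return label if label in CATEGORIES else "other"


def classify(text: str) -> str:
    # Requires the `anthropic` package and ANTHROPIC_API_KEY in the environment.
    import anthropic

    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumption; the write-up only says "Claude API"
        max_tokens=10,
        messages=[{"role": "user", "content": build_prompt(text)}],
    )
    return parse_label(response.content[0].text)
```

Keeping prompt construction and response parsing as pure functions means the classification contract can be unit-tested without spending API calls.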

Confidence thresholds

Classification boundaries were set empirically using the elbow method and silhouette scoring to identify the optimal number of complaint clusters.
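A minimal sketch of that cluster-count selection, using scikit-learn's KMeans and silhouette score on synthetic data (the real input was the 100k+ historical enquiries, not blobs):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the enquiry embeddings: 4 well-separated clusters.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Silhouette peaks near the true cluster count; the elbow on inertia
# gives a second, independent read on the same choice.
best_k = max(scores, key=scores.get)
```

In practice the two methods can disagree, which is why the write-up pairs them: the elbow suggests a range and the silhouette score arbitrates within it.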

Challenges

Data quality

Customer language was often non-specific, and NPS scores frequently contradicted the sentiment expressed in the written text. The score signal was deprioritised in favour of the text, and a LightGBM model was introduced to reclassify scores from the text directly, cleaning the signal upstream before it reached the classifier.

Governance timing

The Data Governance Council process was not engaged during the PoC phase in order to accelerate early delivery. This created friction prior to productionisation and added time to the final release. Engaging governance earlier in future projects is a direct takeaway.

Stakeholder approach

Leadership wanted a fail-fast approach to identify the best path forward quickly. The structured PoC across vendor and build options directly addressed this, providing clear evidence to support the final recommendation.

Outcome

92% classification coverage achieved across complaint categories. Replaced the rule-based routing system entirely in production. Materially improved routing accuracy and resolution efficiency across the complaints function.

My role

Owned the project end-to-end, from discovery and problem framing through architecture, vendor evaluation, team coordination, and production handover. Identified rerouting as the core problem through internal research, defined the technical approach, and presented the build vs buy recommendation to stakeholders.

What I would do differently

I would have engaged the Data Governance Council earlier in the process, during the PoC rather than after. Skipping this to accelerate early delivery created a bottleneck before productionisation that cost more time than it saved. I would also have sought richer data sources from the outset to improve signal quality upstream.

Generative AI · NLP · LLMs · Production ML
Python · Claude API · LightGBM · RAG · KNN clustering · LangChain · Snowflake · MLOps