Amazon PackApp Redesign, Edomiyas Beyene

01

The scale of a broken interface

Amazon's fulfillment network depends on packing software to tell associates what to scan, what box to use, what labels to apply, and when a shipment is complete. The associates doing that work spend their entire shift inside one tool: PackApp.

PackApp had not meaningfully evolved with years of operational change. New equipment, new materials, new process paths, regional variations, and a more diverse workforce had all been layered onto an interface originally designed for a simpler operating model. Associates spoke dozens of languages. They had varying levels of tech literacy. Many were new to the job, cross-trained across stations, or working in physically demanding environments for long shifts. The interface did not support that reality.

The numbers showed the cost of complexity: Pack took longer to train than comparable process paths, its workflows were fragmented across 20+ pack modes, and its operational cost per unit was rising. The problem was not just visual debt. It was cognitive load at industrial scale.

20+

Fragmented pack modes across the fulfillment network

88 hr

Hours to proficiency in Pack, compared with 43 hr in a comparable process path

$0.38

Per-unit pack cost in 2022, up from $0.26 in 2017 (an increase of about 46%)

12 FCs

Research conducted across North America, Europe, and Japan

02

Seeing the real problem

Before designing anything, I needed to understand what was actually broken. I co-led a 12-week discovery phase across 12 fulfillment centers in North America, Europe, and Japan. The research included field observation, heuristic evaluation of core pack process paths, stakeholder interviews across 6 Amazon teams, and usability and customer research sessions with 104 participants across 6 countries.

I also conducted a full UI audit, mapping screens across 20+ pack modes, cataloging repeated components, identifying inconsistent patterns, and building an Associate Map: a complete view of every digital and physical step a packer takes in a shift.

What I found was not a visual design problem. It was a cognitive load problem at scale. Associates were not failing because they did not understand packing. They were failing because each pack mode forced them to rebuild their mental model. The same job appeared as different interfaces depending on station type, region, fulfillment type, or process path.

Pain 01

"I don't know which mode I'm in until something goes wrong."

Associate, FC-PDX5, Portland OR

Pain 02

"The codes mean nothing to me. I memorized them, but I still guess."

Associate, FC-LGB8, Los Angeles CA

Pain 03

"When I'm cross-trained to a new station, I feel like it's my first day again."

Associate, FC-MAN1, Manchester UK

Research synthesis and persona artifacts

03

The decision that mattered most

PackApp had accumulated more than 20 distinct pack modes over time. Each mode existed for a reason: station variation, regional requirement, packaging type, fulfillment type, or operational edge case. The instinct from leadership was to visually modernize the interface while preserving the underlying structure. I pushed for a different direction.

The modes were not fundamentally different jobs. They were the same job with variation: scan item, select packaging, complete shipment. The underlying logic could be unified into one adaptive interface that surfaced only what was needed for the current task. That became the core product decision: unify 20+ pack modes into one adaptive interface, with operational variation handled in the logic layer instead of the UI layer.
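To make the idea of handling variation in the logic layer concrete, here is a minimal sketch, not PackApp's actual implementation. It assumes a hypothetical `StationContext` describing the station and a hypothetical `resolve_steps` function that returns the ordered steps for one shipment. The extra steps (`apply_gift_wrap`, `apply_customs_label`) are invented examples of operational variation; only the three core steps come from the description above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical names throughout: StationContext, resolve_steps, and the
# extra step labels are illustrative assumptions, not PackApp identifiers.


@dataclass(frozen=True)
class StationContext:
    region: str                       # e.g. "NA", "EU", "JP"
    packaging_type: str               # e.g. "box", "mailer"
    needs_gift_wrap: bool = False     # assumed example of variation
    needs_customs_label: bool = False # assumed example of variation


# The core job is the same in every context.
CORE_STEPS = ["scan_item", "select_packaging", "complete_shipment"]


def resolve_steps(ctx: StationContext) -> List[str]:
    """Return the ordered steps for one shipment, one step per screen.

    Variation is decided here, so the UI only ever renders a flat list
    of steps and never needs a separate mode.
    """
    steps = list(CORE_STEPS)
    if ctx.needs_gift_wrap:
        # Insert immediately before packaging selection.
        steps.insert(steps.index("select_packaging"), "apply_gift_wrap")
    if ctx.needs_customs_label:
        # Insert immediately before the shipment is completed.
        steps.insert(steps.index("complete_shipment"), "apply_customs_label")
    return steps
```

In this sketch, a station with no special requirements gets exactly the three core steps, while a station that needs a customs label gets one additional screen in the same flow rather than a different interface.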

The argument

This was not a small design recommendation. It had engineering cost, migration risk, and operational sensitivity.

To make the case, I brought field research, the Associate Map, cross-training friction data, and usability session evidence to leadership reviews. The argument was not that the design was outdated. It was that the interface architecture was creating errors, slowing training, and making every mode switch feel like a new job. Usability testing confirmed that associates made significantly more errors when the UI pattern changed, even when the underlying task stayed the same. The fragmentation itself was the failure mode.

That evidence shifted the conversation from visual refresh to system redesign.

The friction

The initial brief was a visual refresh. The research made the case for a system redesign.

The initial brief called for a visual update: new colors, cleaner typography, same underlying structure. Twelve weeks of field research showed that the visual layer was not the problem. The mode fragmentation was. I brought the Associate Map, cross-training error patterns, and usability session evidence to the leadership review. The argument was not that the design was ugly. It was that the architecture was causing measurable harm and that a deeper redesign was the lower-risk path forward.

04

Five decisions that changed the interface

The redesign was built around five compounding decisions, each tied directly to a failure mode found in research.

01 /

One task per screen

The old interface showed competing panels at the same time. Associates had to scan, read, compare, and decide under time pressure. The redesign put one primary action at the center of the screen and collapsed secondary information unless it was needed. This reduced decision noise and made the next action obvious.

02 /

Replace codes with visuals

Alphanumeric codes like "1BF" made sense to veteran associates who had memorized them, but they created friction for new associates and cross-trained workers. The redesign paired packaging instructions with illustrated box types, visual dimensions, and directional placement cues. This made the correct action easier to recognize without relying on memorization.

03 /

Show progress

The old workflow gave associates little sense of progress within a tote or batch. The redesign added persistent progress feedback, such as "0 of 5 shipments processed," giving associates a clearer sense of pace, completion, and control.

04 /

Make completion states useful

The old completion state was easy to miss and did not clearly guide the next action. The redesign created a stronger completion moment with a clear visual state and explicit next instruction. The goal was not celebration for its own sake. It was to reduce hesitation between tasks.

05 /

Directional illustrations for complex steps

Text-heavy instructions created accessibility and language barriers. The redesign used directional illustrations to show what to do and where to do it. This reduced dependency on trainer explanation and improved comprehension across languages, literacy levels, and physical station setups.

05

The tradeoffs I made

Every meaningful decision had a legitimate counterargument. I documented each tradeoff and used validation to reduce risk before production handoff.

GAINED

One adaptive interface

Consolidating 20+ modes into one adaptive interface reduced mode-switching friction and created a more consistent mental model.

Tradeoff: A more complex logic layer and a longer engineering migration.

Why I held the decision: Usability testing showed that associates made more errors when the UI pattern changed, even when the underlying task stayed the same. The fragmentation itself was the defect source.

GAINED

Visual packaging guidance

Replacing memorized box codes with visual packaging guidance improved first-time comprehension.

Tradeoff: Veteran associates who knew the old codes needed a brief reorientation period.

Why I held the decision: Testing showed that new and cross-trained associates completed packaging decisions more confidently when visual guidance was present.

GAINED

One primary action per screen

Simplifying each screen around one primary action reduced scanning, hesitation, and competing attention.

Tradeoff: Power users lost some density and simultaneous visibility.

Why I held the decision: The old dual-panel layout supported expert speed in some cases, but it created avoidable errors for new, cross-trained, and multilingual associates. The safer design was the more scalable design.

ACCEPTED RISK

Separating validation from launch impact

Because the redesign was validated before broader rollout, I did not position the outcome as shipped business impact. Instead, I framed the work around validation quality: usability testing, pilot-site instrumentation, stakeholder alignment, and production readiness. That distinction mattered. The goal was to show confidence without overstating causality.

06

What we validated before deployment

Validation drew on three sources. First, multiple rounds of usability testing measured task completion, time-on-task, and first-time task success against the existing interface. Second, Pendo instrumentation at a single fulfillment center tracked behavioral data across core pack workflows. Third, post-task confidence scoring captured associate clarity and preference after each session. Across the usability rounds, 48+ associates participated, with a 12-associate cohort tested head-to-head against the current interface. The figures below reflect what was measured in testing. This direction was validated for production investment and handoff; it was not deployed at scale.

The strongest signal was not that the new interface looked cleaner. It was that associates understood the workflow faster, made fewer avoidable mistakes, and needed less explanation to complete unfamiliar tasks.

15%

Faster task completion across core workflows, measured at the instrumented pilot site

25%

Improvement in first-time task success across usability testing rounds

30%

Increase in perceived clarity and confidence, measured through post-task feedback

20+

Pack modes represented in one adaptive interface model

07

What I learned

This project changed how I think about senior UX design. At this scale, good design is not about elegance. It is about reducing risk for the person doing the work at the edge of the system: the associate packing for eight hours, in a second language, on their third station of the day, under operational pressure. Designing for that person forced me to think beyond screens. I had to connect field research, workflow architecture, business cost, training time, accessibility, engineering complexity, and leadership risk into one product argument.

The biggest lesson was that seniority is not taste. Seniority is knowing when the brief is too small, building the evidence to prove it, and creating enough confidence for the organization to make the harder but better decision.

Redesigning the interface behind Amazon's packing workflow
