Case Study
CSL Behring / Fuze
Designing trust in a research data catalog
What this proves
I can design trustworthy AI and collaboration patterns for regulated research environments where source visibility, attribution, and verification matter as much as retrieval speed.

Getting oriented
Fuze is CSL's internal research data catalog. Scientists use it to browse datasets through metadata and dense tabular views. Before this project, the catalog displayed data but offered no way to discuss it in context. Feedback lived in email threads and Slack channels, disconnected from the values being questioned. AI retrieval existed in prototype form, but it still needed trust scaffolding before scientists could act on its output.
I started by auditing the existing catalog surface, CSL's evolving design system, and adjacent AI and collaboration patterns. The goal was to understand where discussion should live, how source visibility should work, and how much of the workflow needed to stay anchored to the dataset rather than moving into a generic assistant surface.
Annotations and Comments (Shipped)
Making research tables conversational
Static data tables offered no way to clarify or question values. I introduced cell-level annotations that let users open comment threads directly within a dataset. Progressive disclosure kept tables readable by showing only one master thread per cell. I tested multiple action-bar layouts to maintain accessibility at narrower widths and to preserve clarity inside a scan-heavy table surface.
The table workspace also needed structural work beyond comments. I used spreadsheet-informed patterns such as frozen headers, sortable columns, inline indicators, and density controls to support the way scientists actually moved through data. The comment system had to fit into that environment without overwhelming it.
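As a rough illustration of the "one master thread per cell" rule, here is a minimal TypeScript sketch of how a thread could be anchored to a cell. The types and the openThread helper are hypothetical, not the shipped Fuze data model; the point is that keying threads to stable row identifiers rather than visual indices keeps a thread attached when scientists re-sort or filter the table.

```typescript
// Hypothetical sketch: anchoring one master thread per table cell.
// All names here are illustrative assumptions, not Fuze's actual API.

interface CellAddress {
  datasetId: string;
  rowId: string;    // stable row identifier, not a visual index (survives sorting)
  columnId: string;
}

interface Comment {
  id: string;
  authorId: string;
  body: string;
  createdAt: string; // ISO 8601 timestamp
}

interface AnnotationThread {
  id: string;
  anchor: CellAddress;
  comments: Comment[]; // replies live inside the single master thread
  resolved: boolean;
}

// Enforce the progressive-disclosure rule: at most one master thread per cell.
const cellKey = (a: CellAddress) => `${a.datasetId}:${a.rowId}:${a.columnId}`;

function openThread(
  threads: Map<string, AnnotationThread>,
  anchor: CellAddress
): AnnotationThread {
  const existing = threads.get(cellKey(anchor));
  if (existing) return existing; // reuse the master thread instead of stacking a second one
  const created: AnnotationThread = {
    id: crypto.randomUUID(),
    anchor,
    comments: [],
    resolved: false,
  };
  threads.set(cellKey(anchor), created);
  return created;
}
```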


How I anchored discussion to data
Before locking the annotation model, I looked at how spreadsheets and other table-heavy tools handle notes, comments, and indicators. The goal was not to copy Excel or Google Sheets directly. It was to borrow the interaction logic that already makes dense tabular discussion legible: visible markers, clear attachment scope, and side-panel patterns that preserve the table as the primary surface.
- Indicators needed to be discoverable without overwhelming the table
- Thread scope had to stay obvious: page, row, or cell
- Overlay-heavy patterns were too fragile for dense catalog browsing
- Sidebar-based disclosure preserved scan speed while keeping discussion attached to evidence

Why scope had to be explicit
Early exploration suggested that combining page-level and cell-level discussion into a single panel would simplify the interface. Testing pushed the design in the opposite direction. Scientists needed clearer scope boundaries so they could tell whether feedback applied to the full dataset or a specific value. Merging the layers was simpler to build but more ambiguous to use. Separating them added one interaction step but made the system easier to trust.
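One way to keep that boundary explicit in the underlying model is a discriminated union over scope, so a thread can never be ambiguous about what it annotates. A hypothetical TypeScript sketch; these names are illustrative, not the production schema:

```typescript
// Hypothetical sketch: explicit thread scope as a discriminated union.
// Every thread must declare whether it targets the page, a row, or a cell.

type ThreadScope =
  | { kind: "page"; datasetId: string }
  | { kind: "row"; datasetId: string; rowId: string }
  | { kind: "cell"; datasetId: string; rowId: string; columnId: string };

// The UI can always tell users exactly what a piece of feedback applies to.
function scopeLabel(scope: ThreadScope): string {
  switch (scope.kind) {
    case "page":
      return "Entire dataset";
    case "row":
      return `Row ${scope.rowId}`;
    case "cell":
      return `Cell ${scope.rowId} / ${scope.columnId}`;
  }
}
```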
Reusable comment architecture
The discussion layer also needed to scale beyond one good screen. I built the comment system as a reusable component pattern with state coverage for replies, permissions, counters, and visibility rules, rather than treating each thread view as a custom layout (see the sketch after the list below).
- Supported different thread contexts without duplicating components
- Made permissions and reply behavior easier to test visually
- Helped keep annotation UI consistent across page and cell-level patterns
- Improved engineering handoff by clarifying state coverage
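
A hedged sketch of what that reusable contract could look like as a single typed component interface. The prop names are assumptions for illustration, not the component API that shipped:

```typescript
// Hypothetical sketch: one thread component contract reused across
// page-level and cell-level contexts, with explicit props for the
// states listed above. Names are illustrative.

interface ThreadComment {
  id: string;
  authorId: string;
  body: string;
}

interface ThreadViewProps {
  context: "page" | "cell";                        // where the thread renders
  comments: ThreadComment[];
  replyCount: number;                              // counter shown on collapsed indicators
  permissions: {
    canReply: boolean;
    canResolve: boolean;
    canDelete: boolean;
  };
  visibility: "expanded" | "collapsed" | "hidden"; // disclosure state
  onReply: (body: string) => void;
  onResolve: () => void;
}
```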

Two layers, two kinds of discussion
Not every question belonged at the cell level. Some comments were about the dataset as a whole, while others needed to stay pinned to a specific value. Keeping those two layers distinct made the system easier to trust and easier to resume.

Ask AI (Internal Pilot)
Grounded retrieval, not chatbot UX
CSL was piloting AI retrieval inside the research catalog. The system could answer questions about datasets, but trust depended on visible grounding. The adjacent AI tools we reviewed were technically capable, but they treated answers as endpoints rather than as starting points for investigation. Scientists needed to verify claims against source data, not take generated summaries on faith.
I positioned the assistant as a discovery and orientation tool rather than an authoritative answer engine. Every response displayed its sources with equal prominence alongside the answer, scoped its context explicitly, and made the underlying data inspectable. The goal was not “better chatbot,” but a trustworthy starting point for investigation.

How the assistant stayed grounded
The guardrails were not generic chatbot safety patterns. They were designed around the retrieval and scope issues that made scientists question whether an answer was trustworthy in a research context.
“I don’t know” was treated as a first-class response state, not an error. Scope indicators clarified what data was in play for a given query. The interaction model was designed to avoid extrapolating beyond visible evidence. These patterns came from observing what happened when answers were vague, overconfident, or difficult to trace back to source data.
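A minimal sketch of what that response model could look like, with “I don’t know” as a designed variant rather than an error path. The TypeScript shapes here are illustrative assumptions, not the pilot’s actual schema:

```typescript
// Hypothetical sketch: every response variant is a designed state,
// including the ones where the assistant declines to answer.

interface SourceRef {
  datasetId: string;
  label: string; // shown alongside the answer so claims can be verified
}

type AssistantResponse =
  | { kind: "answer"; text: string; sources: SourceRef[]; scope: string }
  | { kind: "insufficient-evidence"; scope: string; suggestion?: string }
  | { kind: "out-of-scope"; scope: string; reason: string };

// Rendering treats "I don't know" as a first-class state, not a failure.
function render(r: AssistantResponse): string {
  switch (r.kind) {
    case "answer":
      return `${r.text}\nSources: ${r.sources.map(s => s.label).join(", ")}`;
    case "insufficient-evidence":
      return `I don't know based on ${r.scope}.` +
        (r.suggestion ? ` You could try: ${r.suggestion}` : "");
    case "out-of-scope":
      return `That question is outside ${r.scope}: ${r.reason}`;
  }
}
```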
Discovery over prompting
Rather than expecting scientists to write effective prompts, the interface offered structured entry points. “Find correlations,” “Summarize dataset,” and “Find annotations” reduced blank-field ambiguity and encoded common research intents as guided actions. The surface needed to feel like a research tool with known capabilities, not an open-ended conversation partner.
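Sketched below is one way those entry points could be encoded as intent templates. The structure is hypothetical, though the three labels come straight from the pilot:

```typescript
// Hypothetical sketch: guided entry points as intent templates, so common
// research questions don't start from a blank prompt field.

interface GuidedIntent {
  id: string;
  label: string;          // button text shown in the UI
  promptTemplate: string; // {dataset} filled in from the current page
}

const intents: GuidedIntent[] = [
  { id: "correlations", label: "Find correlations",
    promptTemplate: "Find notable correlations between columns in {dataset}." },
  { id: "summarize", label: "Summarize dataset",
    promptTemplate: "Summarize the structure and contents of {dataset}." },
  { id: "annotations", label: "Find annotations",
    promptTemplate: "List open annotation threads in {dataset}." },
];

const buildPrompt = (intent: GuidedIntent, datasetName: string) =>
  intent.promptTemplate.replace("{dataset}", datasetName);
```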

How users grounded the assistant
The upload and context-setting flows determined what the assistant could reference. I designed three related patterns:
- Session-only uploads for temporary context during a single investigation
- Page-level uploads for repeat reference tied to a specific dataset
- A file manager surface for visible working-set control across sessions
Each pattern made the assistant’s evidence boundary explicit. Scientists controlled what the model could see rather than guessing what it had access to.
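A hypothetical sketch of how those three scopes could be modeled so the evidence boundary stays inspectable. The names and the filtering rule are illustrative assumptions, not the shipped behavior:

```typescript
// Hypothetical sketch: the three upload scopes as an explicit working set.

type UploadScope =
  | { kind: "session" }                 // discarded when the session ends
  | { kind: "page"; datasetId: string } // persists with a specific dataset
  | { kind: "workspace" };              // managed via the file manager surface

interface ContextFile {
  id: string;
  name: string;
  scope: UploadScope;
}

// What the model is allowed to see for a given query: session and workspace
// files, plus page-level files tied to the dataset currently in view.
function visibleContext(
  files: ContextFile[],
  currentDatasetId: string
): ContextFile[] {
  return files.filter(f =>
    f.scope.kind === "session" ||
    f.scope.kind === "workspace" ||
    (f.scope.kind === "page" && f.scope.datasetId === currentDatasetId)
  );
}
```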

Prototype: FuzeAI Workspace (Did Not Ship)
Scaling toward a full assistant interface
This was a directional prototype used to help leadership understand the longer-term platform opportunity.
The sidebar assistant worked for quick queries within a single dataset page. But scientists’ real workflows were multi-session and iterative. They returned to earlier findings, cross-referenced sources, and refined questions as new data arrived. The prototype explored what a fuller research workspace could look like if continuity, evidence scope, and guided exploration were treated as first-class parts of the system.

Workflow memory (left rail)
The left rail preserved research intent over time. Threads worked as resumable questions, while collections grouped related investigations. Instead of restarting from a blank prompt, scientists could return to an earlier line of inquiry with its context intact.

Evidence scope (right rail)
The right rail made the assistant’s working set explicit. Uploaded files, indexed datasets, and active context stayed visible, so the model’s boundary was inspectable instead of implied. This extended the same source-visibility principle used in Ask AI, but at a workspace level.
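To make the two rails concrete, here is a speculative sketch of the workspace state they imply. Since the prototype did not ship, everything here is illustrative:

```typescript
// Hypothetical sketch: left rail (resumable threads and collections) and
// right rail (the assistant's visible working set) as explicit state.

interface ResearchThread {
  id: string;
  question: string;         // the line of inquiry this thread resumes
  contextFileIds: string[]; // evidence attached when the thread was active
  updatedAt: string;
}

interface Collection {
  id: string;
  name: string;      // groups related investigations
  threadIds: string[];
}

interface WorkspaceState {
  collections: Collection[];
  threads: ResearchThread[];
  workingSet: string[];     // file ids currently visible to the model
  activeThreadId?: string;
}

// Resuming a thread restores both the question and its evidence boundary,
// instead of restarting from a blank prompt.
function resumeThread(state: WorkspaceState, threadId: string): WorkspaceState {
  const thread = state.threads.find(t => t.id === threadId);
  if (!thread) return state;
  return { ...state, activeThreadId: threadId, workingSet: thread.contextFileIds };
}
```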

Prompt surface and guided intents (center)
The center surface stayed task-focused. Prompt starters and guided actions reduced blank-prompt ambiguity, while collapsible rails allowed a more focused reading mode when needed. This carried the same discovery-over-prompting principle from the Ask AI pilot into a longer-running workspace model.
Research and Validation
Consolidated findings across the project
Research was not a separate process phase. This work evolved through ongoing review, prototyping, and moderated sessions with research scientists across approximately eight rounds, each of which fed back into the next sprint through updated prototypes, recommendations, and handoff-ready specs.
The most consequential finding was counterintuitive. Early concepts treated AI as something that should feel seamlessly embedded into the catalog. Testing suggested that this actually reduced trust. Scientists wanted to feel they were deliberately invoking the assistant, not being passively routed through an AI layer. More integration was not more trustworthy. The final direction reflected that: the assistant was a tool you reached for, not a layer you passed through.
Other findings shaped the shipped and piloted work:
- Source visibility needed to stay prominent
- Annotation scope confusion was the largest friction point in the comment system
- Guided prompts worked better than an empty text field
- Scientists wanted to understand the assistant’s boundaries before trusting its output


Outcome
The annotation system shipped and entered production use. The Ask AI assistant launched as an internal pilot with research scientists. The workspace concept informed roadmap discussions about how Fuze’s AI capabilities could expand over time.
Across approximately eight rounds of moderated usability testing, the design evolved from a more ambient AI concept into a deliberately invoked, source-visible, scope-aware research tool.
The more lasting outcome was a design language for trustworthy AI in regulated research: visible sources, conservative confidence, explicit scope, and user-controlled invocation rather than ambient intelligence.
Appendix
The artifacts below are available for deeper inspection, but they are not part of the primary scroll because they work better as supporting receipts than as core narrative beats.
Annotations receipts




Ask AI receipts


Workspace receipts

