Yingzi Ye

Cross-Cultural Symptom Translation: The Quietest Form of Data Loss

When symptoms cross languages, metaphors disappear, timelines shift, and cultural meaning evaporates. That evaporation enters the electronic health record (EHR) as “low-quality data.” But the loss happened long before the data existed.

1. Symptom language is not universal

Many multilingual patients do not describe symptoms in a way that fits English clinical expectations. My mother, like many Chinese patients, used:

• metaphors
• imagery
• spatial descriptions
• comparisons to weather, objects, or textures

These were rich, precise descriptions—within our cultural framework.

But interpreters rarely translated them literally.

2. Metaphor → simplification → data loss

My mother might say:

“Like something tightening from the inside.”

The interpreter would translate:

“She has abdominal pain.”

A metaphor collapses into a category. A specific experience turns into a generic symptom label.

3. Cultural hesitation becomes missingness

Many patients feel uncomfortable describing:

• bowel issues
• sexual symptoms
• emotional distress
• long-term patterns

These “hesitations” appear in datasets as:

• missing fields
• incomplete symptom lists
• underreported histories
• shortened notes

The data looks incomplete because the interaction was incomplete.
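One way to picture this is a minimal Python sketch, under stated assumptions: the `Disclosure` states, the `SymptomEntry` record, and both helper functions are hypothetical names invented for illustration, not fields from any real EHR system. The sketch shows how a hesitation and a true "no" can become indistinguishable once an encounter is flattened into a yes/no/blank field and the blank is later read as "no."

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Disclosure(Enum):
    """Hypothetical states for what happened in the room (illustrative only)."""
    REPORTED = "reported"            # patient described the symptom
    DENIED = "denied"                # patient was asked and said no
    NOT_DISCLOSED = "not_disclosed"  # patient hesitated, or the topic was skipped


@dataclass
class SymptomEntry:
    name: str
    status: Disclosure


def to_flat_field(entry: SymptomEntry) -> Optional[bool]:
    """Collapse an entry into a yes/no/blank field, as a simple form might store it."""
    if entry.status is Disclosure.REPORTED:
        return True
    if entry.status is Disclosure.DENIED:
        return False
    return None  # the hesitation is stored as a blank


def naive_has_symptom(flat: Optional[bool]) -> bool:
    """A downstream shortcut that treats a blank the same as 'no'."""
    return bool(flat)


denied = SymptomEntry("bowel changes", Disclosure.DENIED)
hesitant = SymptomEntry("bowel changes", Disclosure.NOT_DISCLOSED)

# The two encounters were different, but the naive reading cannot tell them apart.
print(naive_has_symptom(to_flat_field(denied)))
print(naive_has_symptom(to_flat_field(hesitant)))
```

Both calls give the same answer even though only one patient actually said no. Keeping an explicit "not disclosed" state is one possible design choice for preserving that difference; whether a given system can record it is a separate question.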

4. Pain is culturally expressed, not numerically fixed

Pain scales (0–10) assume both numerical reasoning and cultural comfort with self-report. Many Chinese patients avoid high numbers unless the pain is severe.

My mother often said “3” when she was at an American “7.” The pain scale produced structured numbers—but not accurate ones.

5. Emotion is lost first

Emotional context rarely survives translation:

• anxiety becomes “concern”
• fear becomes “unsure”
• distress becomes “uncomfortable”

Emotion is clinically relevant—but clinicians rarely see it.

6. Why this matters for research

EHR data for multilingual patients often appears:

• inconsistent
• vague
• short
• “noisy”
• hard to model

But these issues do not reflect the patient. They reflect:

• cultural mismatch
• translation loss
• interpreter simplification
• system constraints

This motivates my interest in:

• multilingual representation in EHR
• upstream data loss
• cross-cultural documentation behavior
• equity in health data systems

Because multilingual patients are not “noisy”—the system is.