A reflection on how moving from handwritten records in rural China to structured EHRs in the U.S. revealed deeper issues in representation, consistency, and clinical workflows.
I grew up in a rural part of China where every piece of medical information lived on paper—thin, easily damaged, and often illegible. A doctor’s handwriting was sometimes no more than a hurried sketch, looping across the page in ways that few people could decode. For a long time, I believed this was simply part of “seeing a doctor.” You wait, you receive a slip of paper, and you carry it home without understanding what was written.
When I came to the United States, the first shock was not the technology, not the equipment, not even the cost of healthcare—it was the existence of electronic records. Suddenly, a medical encounter was not a single moment sealed in ink. It was a timeline. A structure. A history.
And yet, the more time I spent inside the U.S. healthcare system—first as a patient, then as a caregiver for my mother—the more I realized that structure is not the same as clarity, and digital does not guarantee consistency. If anything, EHRs made visible a deeper issue: clinical documentation is inconsistent, variable, and often not representative of what patients actually experience.
EHRs promise continuity, legibility, and interoperability. But in practice, I saw enormous variation:
• Some clinicians wrote five detailed paragraphs.
• Others wrote two minimal lines.
• Some used abbreviations I had never seen before.
• Others wrote in full sentences but omitted critical context.
• Nurses documented symptoms using one vocabulary; physicians used another.
The structure existed—but the content inside the structure varied wildly.
For my mother, this variability shaped how doctors understood her symptoms, what questions they asked, and which treatments they believed were appropriate. It shaped their assumptions. It shaped their decisions. It shaped her options.
Healthcare is full of invisible choices that happen long before a model is trained or a treatment is recommended. They happen in the moment a clinician decides:
• what to write,
• how much to write,
• and how to phrase what they write.
These choices are not neutral. They determine how systems interpret the patient later.
A symptom described differently in Chinese became something else in English. A note written by one clinician became something entirely different when summarized by another. EHRs offered legibility, but not fidelity.
Most health ML models rely on EHR data. But EHR data is not a static record—it is a product of human workflow, culture, and documentation behavior.
When I later studied applied mathematics and data science, I found myself asking questions few students asked:
• What if model failure comes not from the algorithm, but from inconsistent upstream documentation?
• What if two patients appear different only because clinicians document differently?
• What if the problem is not missing data, but missing structure?
Digitization solves the storage problem. It does not solve the representation problem. And representation is what models learn.
This essay forms the foundation of my research identity:
• documentation variability
• information structure
• cross-cultural interpretation
• how inconsistencies upstream affect decisions downstream
Paper vs. digital is not only a technological shift. It is a representational shift. And representation determines how clinicians—and models—understand reality.
Understanding this is why I study health data science. It is why I focus on information flows—the messy, human parts of healthcare that shape data long before it becomes “data.”