Models are visible, prestigious, and publishable. But the quiet decisions about how information is structured often matter more than the choice of algorithm.
In applied mathematics and data science, we spend a lot of time thinking about:
• model choice • optimization methods • regularization • performance metrics
But my experience in healthcare—especially as a caregiver for my mother—keeps pulling me back to a simpler, more fundamental question:
What exactly are we modeling?
If the underlying information is poorly structured, no model choice can fix that.
In modern healthcare, the EHR does not just record care. It shapes it.
• If a symptom is not documented, it often does not “exist” for future clinicians.
• If a timeline is documented incorrectly, the history is rewritten.
• If pain is described vaguely, its severity is downgraded in practice.
By the time data scientists see EHR data, it has already passed through:
• multiple people • multiple translations • multiple workflows • multiple simplification steps
A model trained on this data is learning from a layered, transformed, partial reality.
Many inequities in health data are not just about sample size or missing values. They are about:
• how fields are defined
• which options exist in a dropdown
• which languages are supported
• which symptoms are easy to document
• which ones require extra work
Multilingual and low-resource patients often appear as:
• “incomplete records” • “high missingness” • “irregular follow-up”
But these labels often reflect structural barriers:
• interpreter quality • clinic time constraints • cultural mismatch • poor interface design
The structure of the information system encodes inequality before any model is trained.
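One way to make this concrete is to measure missingness per group rather than treating it as a single dataset-wide number. The sketch below is only an illustration: the records are invented toy rows, and the field names (`needs_interpreter`, `pain_score`, `onset_date`) are hypothetical, not taken from any real EHR schema.

```python
from collections import defaultdict

# Invented toy rows for illustration only; field names are hypothetical.
records = [
    {"needs_interpreter": True,  "pain_score": None, "onset_date": None},
    {"needs_interpreter": True,  "pain_score": 6,    "onset_date": None},
    {"needs_interpreter": False, "pain_score": 4,    "onset_date": "2023-01-10"},
    {"needs_interpreter": False, "pain_score": 7,    "onset_date": "2023-02-02"},
]

def missingness_by_group(rows, group_key, fields):
    """Return {group_value: {field: fraction of rows where the field is None}}."""
    counts = defaultdict(lambda: {"n": 0, **{f: 0 for f in fields}})
    for row in rows:
        bucket = counts[row[group_key]]
        bucket["n"] += 1
        for f in fields:
            if row.get(f) is None:
                bucket[f] += 1
    return {
        group: {f: bucket[f] / bucket["n"] for f in fields}
        for group, bucket in counts.items()
    }

rates = missingness_by_group(
    records, "needs_interpreter", ["pain_score", "onset_date"]
)
for group, field_rates in rates.items():
    print(group, field_rates)
```

A gap between groups in a table like this does not say why the fields are empty. It only signals that "high missingness" may be a property of the workflow rather than of the patient, which is the question worth asking before any imputation.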
A more sophisticated model can:
• fit patterns more flexibly • handle nonlinearity • work with high-dimensional input
But it cannot:
• recover details never documented
• infer nuance that translation erased
• reconstruct timelines that were mis-entered
• correct for systematically biased workflows without careful design
In other words:
Models amplify the structure they are given. They do not repair it.
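A small sketch can show why. Assume a hypothetical intake form whose pain dropdown offers only four options, plus a toy rule that falls back to "mild" when nothing matches. Both the dropdown and the `encode` rule are invented for illustration. Once two different descriptions land on the same code, nothing downstream can tell them apart.

```python
from collections import defaultdict

# Hypothetical dropdown: the only pain options this imagined form offers.
DROPDOWN = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}

def encode(description: str) -> int:
    """Map free text to a dropdown code using a toy keyword rule."""
    text = description.lower()
    for label, code in DROPDOWN.items():
        if label in text:
            return code
    # Assumed fallback: vague descriptions are recorded as "mild".
    return DROPDOWN["mild"]

descriptions = [
    "burning pain that wakes me at night",
    "a mild ache after walking",
    "severe stabbing pain",
]

# Group the original descriptions by the code that actually gets stored.
preimage = defaultdict(list)
for d in descriptions:
    preimage[encode(d)].append(d)

# Codes that stand for more than one distinct description.
ambiguous = {code: texts for code, texts in preimage.items() if len(texts) > 1}
print(ambiguous)
```

Here the first two descriptions are stored as the same value. A model sees only that value, so the difference between them is gone before training starts, however flexible the model is.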
This is why I am drawn to questions such as:
• How should symptom fields be structured to reduce ambiguity?
• How can we better capture uncertainty instead of forcing false precision?
• How might we represent multilingual history in a way that preserves nuance?
• Where should free text vs. structured fields be used?
These are design questions, not just engineering ones.
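As one possible answer to the uncertainty and multilingual questions, a symptom onset could be stored as a date range alongside the patient's original words, instead of a single forced date. This is a minimal sketch under my own assumptions: the `SymptomOnset` class and all of its field names are hypothetical, not an existing standard or schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class SymptomOnset:
    """Hypothetical record: onset as an interval, original wording preserved."""
    earliest: date
    latest: date
    original_text: str          # the patient's own words, kept verbatim
    original_language: str      # a language code such as "es"
    translated_text: Optional[str] = None

    def __post_init__(self):
        # Reject intervals that run backwards.
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")

    @property
    def uncertainty_days(self) -> int:
        """Width of the onset window in days; 0 means an exact date."""
        return (self.latest - self.earliest).days

onset = SymptomOnset(
    earliest=date(2024, 3, 1),
    latest=date(2024, 3, 31),
    original_text="en algún momento de marzo",
    original_language="es",
    translated_text="sometime in March",
)
print(onset.uncertainty_days)
```

The design choice is that "sometime in March" stays a month-wide window rather than becoming an arbitrary single day, and the untranslated wording remains available to anyone who later questions the translation.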
In my tutoring, designing flashcards taught me that representation can transform understanding. In healthcare, I see the same pattern, with far higher stakes.
My emerging research identity is less about inventing the most complex model and more about:
• understanding documentation variability • tracing information flow through clinical workflows • studying multilingual data inequity • designing clearer, more humane information structures
I want to work in the space where:
• clinical practice • information design • data science • equity
meet.
Because the structure of information determines:
• what gets measured • who gets represented • which questions can be asked • which answers seem valid
And in that sense, information structure is not just a technical detail. It is a quiet, powerful form of decision-making.