Best Practices for Developing Medical Device Software

Feb 7, 2018

At FP Complete we have experience writing Medical Device software that has to go through

rigorous compliance steps and eventually be approved by a

government regulatory body such as the US Food and Drug Administration (FDA).

In this post we'd like to share some of the best practices and

pitfalls we have learned when working in this area.

You may find this blog post especially relevant if you are

a Software Engineer working on any form of medical software,
a Researcher or Data Scientist trying to turn your research into a product, or
an Engineering Manager or Product Manager trying to deliver a medical product,

but of course you are also invited to read and discuss this

topic with us if you are none of these.

Before we get to the problems and best practices, we'll give

some context by describing what a common project setup for Medical

Device software may look like.

Typical Medical Device project setup

A common team structure inside a company working on a Medical

Device might be:

10 researchers (mathematicians, statisticians, chemists, medical experts, data scientists),
10 software engineers,
a project manager, some people managers, some regulatory experts, and a product owner/business representative.

Together, they want to develop a software product that makes

some form of medical statement (which could be a diagnosis of a

disease, a forecast of how a patient will react to some treatment,

or a recommendation of treatment), or one that takes medical action

(e.g. control logic of a physical device performing a treatment or

administering a medicine).

Because of that, the software will be classified as a “Medical Device” by regulatory bodies such as the FDA or EMA, even though it is just a computer program.

A common project history is:

- There was an internal R&D phase in which the key algorithms were
  discovered; this phase was free of any form of regulation, mainly
  containing researchers and no engineers.
- Now the project is entering a productisation phase where the
  “device” is built based on that R&D, but many algorithm details
  are still unclear.
- Researchers need to continually run smaller experiments on
  their machines, and from time to time longer batch experiments that
  may run overnight. They often do so on separately purchased data
  sets to check and tune their algorithms.

During the productisation phase, you are obliged to operate in

“regulation-safe mode”, meaning that all processes and decisions

need to be well-informed and documented. They must be able to get

through a regulatory audit, if you do not want to be at risk that

your product will be denied approval by your regulator and thus

cannot be used or marketed.

The regulatory experts on your team will help you with this,

telling you what certifications you'll need to get and in which

order to perform which steps. However, they are typically not

experts at Software Engineering, and will rarely be able to provide

concrete advice on how to do your Software Engineering to support

the “regulation-safe mode” as much as possible.

Best practices and pitfalls

Now that the project setup is clear, let's get into some best

practices to exercise and pitfalls to avoid when working on Medical

Device software.

Special considerations for working in a regulated medical environment

Make a list of “reserved terminology keywords”.

The fields of programming and medical regulation have some

overlap in terminology that can result in disastrous

miscommunication unless special care is taken to avoid this. For

example, “unit testing” may mean two different things from the

engineering and regulation perspectives. Your regulatory expert may

completely misinterpret what regulatory steps you have already

completed when you tell them that you've just finished writing some

“unit tests”.

Disambiguate it to e.g. “engineering unit tests” and “regulatory unit tests” and enforce across your team that

everybody use only these explicitly qualified phrases, and never

“unit tests” alone.

A list of terminology that we found ambiguous between engineers

and regulatory people includes:

unit testing
code review
verification
quality
performance
“the device”

Many companies in the medical space may have no experience with software.

Consequently they may try to apply processes to software that

were designed for other products, and do not apply to software.

A common example is the assumption that after the product is

“done”, it will never change again. This is a sensible

expectation

for a drug.

Of course this doesn't work with software: Continuous modifications

are needed, if only for routine security updates. (You can

drastically reduce the frequency of such updates being necessary by

using an advanced programming language such as Haskell, which is designed for safety and

reliability and thus our language of choice for medical software;

however you will never be able to entirely rule out the need for

post-release updates.)

While this is obvious and natural to any programmer, it may not

be to medical experts, and may not be understood in many medical

companies. You may meet heavy resistance to any form of agile

development model, continuous deployment setup, and frequent code

changes after the release of the software. You should ensure that

you train the managers and medical experts of the project on this

aspect of software before you start the project, define clear

boundaries between “device-software updates” and “security-software

updates”, and set expectations, e.g. that the software may have to

be recompiled and re-deployed should a security update for an

underlying software library be necessary.

Make CI the central point where all work comes together.

Continuous

Integration (CI) means merging everybody's work together

frequently and running automated tests on it. While CI is common in

software teams by now, researchers and data scientists may not be

used to it. They may be more familiar with the workflow of

developing their own, often one-off scripts and programs on their

PCs and rarely sharing the code with their team members, instead

only sharing the results.

For a regulated project, you should enforce that

everyone on the team checks any code ever

produced for any purpose of the project, into source code version

control, and that the results produced by this code are generated

or reproduced on the shared CI servers, as opposed to being generated

only on a researcher's own PC. This ensures that it is recorded

which exact code produced which exact results in which exact

environment, which helps a lot when making regulatorily relevant

statements such as “our experiments have confirmed our thesis X”.

It also speeds up development, because everybody on the team can

see what everybody else does, or get notified by the CI server when

accidentally breaking somebody else's program or workflow. You

should, where possible, refuse to accept results as certain unless

you have seen them produced by your CI server, and train everybody

on the team how to follow this workflow.
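
As a rough illustration of recording "which exact code produced which exact results in which exact environment", a CI job could write a small provenance record next to every result. This is only a sketch: the function names `result_digest` and `provenance_record` and the record fields are our own assumptions, and how the commit identifier reaches the job depends on your CI system.

```python
import hashlib
import json
import platform
import sys


def result_digest(result_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the raw result bytes."""
    return hashlib.sha256(result_bytes).hexdigest()


def provenance_record(commit: str, result_bytes: bytes) -> dict:
    """Describe which code produced which result in which environment.

    `commit` is the revision the CI server checked out; obtaining it
    (for example from a CI-provided variable) is left open here.
    """
    return {
        "commit": commit,
        "result_sha256": result_digest(result_bytes),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    # Hypothetical commit id and result payload, for illustration only.
    record = provenance_record("0123abcd", b"estimate=4.2\n")
    # Sorted keys keep the serialised record stable across runs.
    print(json.dumps(record, indent=2, sort_keys=True))
```

Storing such a record with each result makes it possible to answer later which commit and environment a given number came from.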

Advertise your tools in the right language.

When we as programmers use advanced technical tooling like

Haskell, we

can easily enumerate the various features that will make the

software more correct and reliable. However, these features may

mean nothing to a medical expert, and thus may not be easily used

by your team for advertising or explaining to a regulator why your

software is especially safe. Consequently you should do research on

what terms will be understood by medical experts, and map your

tools and features into their terminology.

For example, if you use a compiler featuring static analysis,

you might explicitly advertise this as a form of “formal software

verification”, which is a term most medical experts are familiar

with.

Here's a list of cool tools we've used in the past that fall

under “formal software verification”:

strong typing
referential transparency (pure functions)
parametricity
generative testing
code coverage
model checking
theorem proving

Unexpected changes are the worst.

Code and product changes

As a programmer, you should:

- Try to make all programs deterministic, ideally
  up to byte-identical output. This will drastically help you get needed code refactorings past
  regulatory review, as you can provide evidence that your changes
  did not change the functioning of the device.
- Set up “gold-standard” testing in CI to notice
  any change. Gold-standard testing means that you store the last-approved
  outputs of your algorithms on a (large) data set of inputs in your
  version control system or (if it doesn't fit in there) in another
  form of storage. Each code commit message should then indicate
  whether it is expected to change the results or not. After the
  results have been computed by CI, proceed according to the
  table:

  | | results are identical with gold-standard | results are different from gold-standard |
  |---|---|---|
  | commit message does not expect change | good to merge | not good to merge, investigate why results changed |
  | commit message expects change | not good to merge, investigate why change didn't have the desired effect | possibly good to merge, let medical / data science team sign off the changed results, then update the gold-standard outputs |

  Note how this is different from engineering unit-testing:
  In engineering unit-testing, the programmer defines and
  understands precisely what the output of the algorithm is for each
  single test case. In gold-standard testing, the idea is not to
  understand the output for each input, but to get notified when
  outputs change (independent of what exactly the outputs look like).
  Because of this, gold-standard tests are easier to write: They
  require no thinking effort from the programmer, they only require
  input data to run on.
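
The gold-standard check described above could be sketched roughly as follows. This is a minimal sketch under our own assumptions, not a prescribed implementation: outputs are taken to be JSON-serialisable, the names `canonical_bytes`, `digest` and `merge_decision` are hypothetical, and how "expects change" is read from the commit message is left open.

```python
import hashlib
import json


def canonical_bytes(outputs: dict) -> bytes:
    """Serialise outputs deterministically (sorted keys, fixed separators)."""
    text = json.dumps(outputs, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def digest(outputs: dict) -> str:
    """SHA-256 digest of the canonical serialisation of the outputs."""
    return hashlib.sha256(canonical_bytes(outputs)).hexdigest()


def merge_decision(gold_digest: str, new_digest: str, change_expected: bool) -> str:
    """Apply the four cases of the gold-standard decision table."""
    identical = gold_digest == new_digest
    if identical and not change_expected:
        return "good to merge"
    if identical and change_expected:
        return "not good to merge: change did not have the desired effect"
    if not change_expected:
        return "not good to merge: investigate why results changed"
    return "possibly good to merge: needs sign-off, then update gold standard"


if __name__ == "__main__":
    gold = digest({"patient-1": 0.25, "patient-2": 0.75})
    new = digest({"patient-2": 0.75, "patient-1": 0.25})  # same data, other order
    print(merge_decision(gold, new, change_expected=False))
```

Comparing digests rather than full outputs keeps the stored gold standard small; when the digests differ you would still want the full outputs at hand to see what changed.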

Make only controlled changes:

- Make people announce when they expect a change.
- Roll back any unexpected change.
- Every change must be traceable to a concrete requirement. This
  bit can be done with low overhead by having commits and code
  comments reference issue tracker entries, and the issue tracker
  being well maintained to link together code features with technical
  requirements (“feature X must not crash and must be easy to
  understand”), regulatory requirements (“computation X must not
  store user data”), or business requirements (“computation X must
  finish in under an hour”).
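
One low-overhead way to check the traceability rule is to have CI reject commits whose messages reference no issue tracker entry. The sketch below assumes issue keys of the form `REQ-12`, `REG-7` or `BIZ-3`; these prefixes, like the function names, are invented for illustration and would need to match your own tracker.

```python
import re

# Hypothetical issue key prefixes: technical, regulatory, business requirements.
ISSUE_REF = re.compile(r"\b(?:REQ|REG|BIZ)-\d+\b")


def referenced_issues(commit_message: str) -> list[str]:
    """Return the distinct issue keys mentioned in a commit message, sorted."""
    return sorted(set(ISSUE_REF.findall(commit_message)))


def untraceable_commits(messages: dict[str, str]) -> list[str]:
    """Return the ids of commits whose message references no issue at all."""
    return [
        commit
        for commit, message in messages.items()
        if not referenced_issues(message)
    ]


if __name__ == "__main__":
    history = {
        "a1b2c3": "Fix rounding in dose calculation (REQ-12)",
        "d4e5f6": "Tidy up whitespace",
    }
    print(untraceable_commits(history))
```

A check like this only shows that a reference exists; whether the referenced entry really links to a requirement still depends on the issue tracker being well maintained.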

Process changes

While software engineers love to upgrade their stack and switch

tools and processes frequently, medical people tend to hate it.

However, there are ways to make them more comfortable with it.

As a product manager or similar role, when you

want to make a process change, stick to a predictable order such

as:

- Analyse what change needs to be made.
- Announce that the team will be moving to a new approach X in the future, with a concrete proposal.
- Collect feedback, inviting everyone whose workflow might be
  touched by this move to provide input on how and when it should be
  done to reduce disruption to a minimum.
- Give it a memorable name that people can use for referring to the motion.
- Perform a coordinated switchover at a pre-announced time, making sure everybody knows about it in advance.

Here is an example:

Let's say it is necessary that data scientists switch their

working environment operating system (OS) from Windows to Linux so

that developers can more easily reproduce their results in the

production software.

- Investigate in audience-limited conversations (e.g. with
  programmers) whether the data scientists' desktop OS has to be
  changed, or whether it is sufficient that they connect to a Linux
  machine from their current Windows machines.
- Announce that the team would like to move the data scientist
  workspaces from Windows to Linux within the next three months, and
  present your concrete proposal so far, which may include
  video-tutorial based training on how to use the new workspaces
  remotely from Windows, as well as a dedicated engineer to help with
  the migration.
- Collect feedback, such as a data scientist saying that some
  scripts don't work on Linux. Discuss with this data scientist
  (but in public) whether an engineer helping to port these scripts to
  Linux before the move would address that issue. Another data
  scientist may point out that the move should be done after
  producing results X but before starting feature Y. Refine the
  schedule accordingly.
- Call the motion “Datasci-Linux”.
- Ensure everybody knows that “Datasci-Linux” will be performed in the last week of April.

Team organisation

Have a real ops, tools and help team.

A lean “DevOps”-only approach usually

doesn't work with researchers.

While developers like to control machines and servers themselves

and the team can be made more efficient that way, researchers like

to have their heavy machinery moved by people who understand what

they are doing.

Thus, as a manager, you should make sure that:

- Ops takes care of researchers' working environments,
  software needed, computing clusters and so on, so that researchers
  don't have to spend time on trial and error (unless they
  want to learn it).
- If a researcher wants to do some overnight computation job, an engineer is assigned to them to execute it properly.
- Recurring jobs are coded up so they can be automated.
  Non-software people are surprisingly unfamiliar with that idea and
  will happily do the same manual task again and again.
  The rule of thumb is: Do it manually 3 times, then code it up
  (this is a good rule for general software development, but you may
  have to emphasise it especially in a medical environment where
  manual procedures that cannot be automated are very common).
- You have people who can continually help with every-day
  issues with tools the team uses, and are tasked to train everybody
  in using and understanding version control software and the
  development model. A lot of time can be wasted if somebody does not
  understand how to get their changes in the right place with
  git, pushes things to the wrong branch, and so on.

Separate roles

Define ahead of time what role can block what activity to avoid

unnecessary project slowdowns.

As a Project Manager, you should make sure that:

- Regulatory people don't use their almost unlimited veto power
  to block decisions that are outside of their domain. For example, a
  regulatory reviewer should not use their veto to enforce changes
  that are irrelevant for regulatory review.
- Programmers should be able to block researcher or regulatory
  decisions when they are not realisable, such as using a given
  method when it cannot be implemented correctly, accurately, or in
  time.
- You (the Project Manager) are actually able to exercise the
  power over the schedule and work items that was given to you. A
  Project Manager's responsibility is to ensure realistic estimates,
  also at times pushing back against features that executives may
  want to see in short time, if, based on programmer or researcher
  feedback, they cannot be realised that quickly.

Managing code, processes and documentation

Version control

Enforce that all code be checked into version control. Make no exceptions here. Arrange for personal scrap spaces in version

control, that are clearly marked as not being under the same

scrutiny as “device code”. If you do not do this, researchers and

programmers will not check their experiments into version control,

and the project will suffer. Examples for such scrap spaces are

branches prefixed with wip/ (for work-in-progress), and a personal-workspaces/username directory

hierarchy.

In general, always clearly separate device-code and non-device code. This need not mean that they should be in

independent source code repositories (as that would forbid ensuring

experimental scripts work with the latest version of device-code).

Instead, use other explicit means as separation, such as having one

directory for device, and one for non-device code.

Relatedly, separate the device from the platform needed to run the device (such as deployment

infrastructure and server tools). As mentioned earlier, this is

especially important for infrastructure security updates.

You should optimise version control usage for

efficiency. For example: Have branches with a doc- prefix only run documentation builds, and skip

the big or costly stages other builds may include. People will hate

tools for structured working such as version control and CI if they

make their workflow slow. Always provide fast ways to do

things.
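
The branch-prefix idea can be sketched as a small function that decides which CI stages to run. The stage names in `ALL_STAGES` and the function `stages_for_branch` are made up for illustration; most CI systems let you express the same rule in their own configuration format instead.

```python
# Hypothetical stage names; a real pipeline will have its own.
ALL_STAGES = ["build", "unit-tests", "gold-standard", "docs"]


def stages_for_branch(branch: str) -> list[str]:
    """Pick the CI stages to run for a branch.

    Branches with a doc- prefix only build documentation and skip the
    big or costly stages; every other branch runs the full pipeline.
    """
    if branch.startswith("doc-"):
        return ["docs"]
    return list(ALL_STAGES)


if __name__ == "__main__":
    for name in ["doc-fix-typos", "feature/dose-model"]:
        print(name, "->", stages_for_branch(name))
```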

If possible, use a linear development model in

version control (such as a “rebasing” workflow in

git). In an environment where reproducibility is of utmost importance, being able to do automatic bisections to find regressions is more important than

developers having to resolve more merge conflicts.
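
To show why a linear history helps, here is a minimal sketch of what an automatic bisection does. With git you would normally let `git bisect` do this; the function `first_bad_commit` and the `is_good` callback are hypothetical stand-ins for "check out this commit and run the test".

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Find the first bad commit in a linear history by binary search.

    `commits` is ordered oldest to newest. The first entry is assumed to
    be known good and the last entry known bad, as in a usual bisection.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(8)]
    # Pretend the regression was introduced in c5.
    print(first_bad_commit(history, lambda c: int(c[1:]) < 5))
```

The search only needs about log2(n) test runs, but it relies on the history being a single line of commits where "everything before is good, everything after is bad" makes sense.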

Be especially careful with development practices that can scare regulatory people.

TODOs

As a programmer or data scientist,

Don't write: TODO: fix this code.

This may suggest there is a flaw in the device that can make it unsafe, or that it is unfinished.
Assume that regulatory reviewers have no understanding of programming and take the words you write literally.

Do write: TODO-ENG: Future performance

enhancement: While this computes the correct result and is safe to

use, we should make this faster by doing XYZ.

For each project, define and document clear criteria for labels

like TODO . For example, you might designate TODO-ENG as a label to mean “irrelevant for the medical device operating correctly, but engineering would like to change this”, and TODO-DEVICE as a label to mean “this must be

changed before the release or next major milestone on the

roadmap”. You can then ensure before the next milestone that all TODO-DEVICE labels are gone.
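
A check like this is easy to automate. The following sketch scans source lines for TODO labels and reports which ones would block a milestone. Treating a bare TODO (or an unknown suffix) as a blocker is our own assumption here, as are the names `scan_todos` and `release_blockers`.

```python
import re

# Matches "TODO" optionally followed by an upper-case suffix such as "-ENG".
TODO_LABEL = re.compile(r"\bTODO(?:-([A-Z]+))?\b")
ALLOWED_SUFFIXES = {"ENG", "DEVICE"}


def scan_todos(lines: list[str]) -> dict[str, list[int]]:
    """Group 1-based line numbers by the kind of TODO label found on them."""
    found = {"TODO-ENG": [], "TODO-DEVICE": [], "unqualified": []}
    for number, line in enumerate(lines, start=1):
        for match in TODO_LABEL.finditer(line):
            suffix = match.group(1)
            if suffix in ALLOWED_SUFFIXES:
                found[f"TODO-{suffix}"].append(number)
            else:
                found["unqualified"].append(number)
    return found


def release_blockers(found: dict[str, list[int]]) -> list[int]:
    """Lines that must be resolved before the next milestone (assumed policy)."""
    return sorted(found["TODO-DEVICE"] + found["unqualified"])


if __name__ == "__main__":
    source = [
        "x = 1  # TODO: fix this code",
        "y = 2  # TODO-ENG: Future performance enhancement",
        "z = 3  # TODO-DEVICE: validate input range",
    ]
    print(release_blockers(scan_todos(source)))
```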

Ensure everybody (including regulatory people) knows which label

means what. Add this information to your documentation. Also see

the next point for more on that.

Enforce documentation for all coding processes

Whenever you make a decision of how things are done in the

project, write it down, ideally in version control.

Don't propagate engineering, review, and other process rules by

word of mouth. One way regulators assess you is whether you stick

to your own processes; they will not be able to find evidence of

you doing so if you haven't written the processes down.

Finding documentation

Only having documentation is not enough. It also needs

to be discoverable.

Use simple and

obvious ways for people to find any documentation they might need.

An approach that works well is to place a README file

in each sub-project's top level directory (of course under version

control), and link to other documents from this entry point.

Use a simple tagging scheme, such as tags in brackets (e.g.

[ALIEN-SALIVA-DENSITY-ESTIMATION]) that allows you to

place textual anchors and references to them in code and

documentation. This is because linking from documentation to

documentation (which may be easier, e.g. using hyperlinks) is not

enough; you will also need to link from code to docs and from docs

to code (and referring to file name plus line number is obviously

not a good choice given that code can move around).

Medical device software tends to have a lot of

documentation, so you will have many links and references in your

project. At the time of an audit, you don't want auditors unable to

follow outdated documentation links. Have your tools team write

tooling to find dangling links and references, possibly also to

produce simple graphs so that you can easily visualise

documentation references.
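
Such tooling can start out very small. The sketch below assumes one possible convention that is not prescribed anywhere in this post: an anchor is declared by writing `anchor: [TAG]`, and every other bracketed tag is a reference. The function name and the example file names are likewise hypothetical.

```python
import re

# A tag is an upper-case identifier in brackets, e.g. [SOME-TAG].
TAG = re.compile(r"\[([A-Z][A-Z0-9-]*)\]")
# Assumed convention: an anchor is declared as "anchor: [SOME-TAG]".
ANCHOR = re.compile(r"anchor:\s*\[([A-Z][A-Z0-9-]*)\]")


def dangling_references(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each file name to the tags it mentions that have no anchor anywhere.

    `files` maps file names to their text content (code or documentation).
    """
    anchors: set[str] = set()
    for text in files.values():
        anchors.update(ANCHOR.findall(text))

    dangling: dict[str, set[str]] = {}
    for name, text in files.items():
        missing = set(TAG.findall(text)) - anchors
        if missing:
            dangling[name] = missing
    return dangling


if __name__ == "__main__":
    project = {
        "docs/README": "anchor: [ALIEN-SALIVA-DENSITY-ESTIMATION]\nMethod description.",
        "src/estimate.py": "# see [ALIEN-SALIVA-DENSITY-ESTIMATION]\n# see [OLD-CALIBRATION]",
    }
    print(dangling_references(project))
```

Run in CI, a check like this fails the build as soon as a reference loses its anchor, long before an auditor tries to follow it.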

Interaction between researchers and engineers

You cannot simply throw a bunch of engineers and researchers

together and expect that they will work in perfect symbiosis and

produce the desired results.

In many companies, R&D and Engineering may be separate

departments that may have developed different ways of working and

communicating. This may be even more true when one of the two sides

is brought in by a different company or via contracting. Bringing

them together often warrants extra planning and being more explicit

than usual when setting up joint workflows.

Define clearly who is in charge at each stage of the project.

In the R&D phase, engineering should likely assist researchers to get good results, quickly.
In the productisation phase, researchers should likely assist engineers to make an excellent product.

Discourage walled-off thinking.

Make clear that the success of the project depends on the

successful interaction between researchers and engineers.

Most importantly, be aware of the “my side is fine” problem.

Researchers like to think:

These are my preconditions, and
they have to be provided by the engineers. If those are provided,
we'll be fine.

Engineers like to think:

As long as I code up these maths written by the researchers, I'll be safe.

As a result, neither of the two sides makes sure that the

critical preconditions that make the system work are actually

provided.

To avoid this, you should make sure each side understands the

other well, that the interface between them is understood

especially well by both, and that they talk often about it.

Encourage mutual training: Have Researchers train

Engineers to understand their maths, and Engineers train

Researchers to read their code.

Establish critical thinking and a culture where everyone can ask everything.

This is one of the most important bits when trying to make a

safe device.

Allow and encourage any form of understanding question. “Is this safe to do, and why?” should

be a common thing to be heard and written in your project.

Establish that this does not question anybody’s reputation. Employ blame-free evaluation and analysis

techniques.

Summary

Hopefully you have found these insights useful or

interesting.

If you'd like our help with delivering Medical Device software,

don't hesitate to contact us.