Devops

Apr 5, 2018

FinTech best practices: DevOps Priorities for Financial Technology Applications

People used to think software development was “done” when the

code was written and passed all its tests. But modern IT systems

aren’t done until they are online, running, and integrated with

their data feeds, storage, networks, and administration systems.

This can involve very elaborate operations steps, such as

dynamically creating a whole array of virtual servers, storage

devices, accounts, software configurations, and network

configurations.

These days, almost every

company wants DevOps to automate such powerful infrastructure,

speed up release cycles, and improve reliability and uptime. As an

ultra-high-value and rapidly changing industry, Financial

Technology lives near the cutting edge of innovation. Its DevOps

can include continuous integration and continuous deployment

(CI/CD), automated testing, containerization (as with Docker and

Kubernetes), system monitoring, use of cloud features (like AWS,

Azure, or a private cloud), virtual private clouds, extensive

firewalls, advanced network security, and more. FinTech IT projects have

a lot at stake, and wise engineers will hand-pick DevOps priorities

to match the project’s objectives and exposures. Let’s look first

at what FinTech overall should expect from DevOps, and then at how

different subfields should emphasize additional, specialized DevOps

requirements.

What every FinTech solution needs from DevOps

Compared to other

industries, FinTech places an unusually high priority

on:

  • Maintainability: improved analyses, features, and data

    integrations must roll out very frequently, with very low

    latency

  • Quality: an uncaught mistake or security hole can quickly cost millions of US dollars, sometimes much more

  • Data integration: FinTech applications are fundamentally about

    digesting a never-ending stream of new information, and the more

    feeds (or the more atomic inputs) that can be handled, the

    better.

DevOps is

maintainability’s best friend. As far back as 2013, here at FP

Complete, we were releasing large upgrades to major server

applications every few weeks. These days, at large Internet

companies it is routine to see daily release cycles, and faster is

quite doable. FP Complete recommends a fully automated Continuous Integration and Continuous Deployment (CI/CD) system, including automated builds, an

automated test suite, immutable containerized servers, and

post-deployment health checks with rollback capability (blue-green

deployment). If you haven’t implemented containers yet, you almost certainly should. FP Complete also

recommends a formal Quality Assurance

system. Data integration can be

quite application-specific, but FP Complete recommends choosing a

very small number of supported data formats, and having a clean

layer providing these formats after data ingestion, normalization,

and cleansing from a more diverse set of inputs. A service-oriented

architecture (SOA) can make it easy to add new data-feed parsers in

a completely language-independent manner, ensuring system

extensibility. (See my comments on “Modular Design” here.) Automated deployment and

monitoring let you have more services running as separate

processes, by eliminating the need to manually examine each

server’s status constantly. So DevOps can be a great

help to FinTech in general. But there is still far more to be had

-- and our next priorities depend on which kind of FinTech solution

we are building.

DevOps for cryptocurrency

Cryptocurrency systems

are of course sensitive to attacks in which a person attempts to

steal the coins. Numerous real-world losses have been traced to a

failure to implement proper DevOps, leaving opportunities for

criminals. If you’re implementing

or trading a cryptocurrency, here are some DevOps issues to focus

on right away:

  • Automated testing. If your build system allows

    you to release code that has not been through your test suite -- or

    worse, allows you to be unsure whether the released code

    was tested -- you are taking undue risk. Quality assurance

    automation should be a core part of your build system. This is even

    truer if you are using CI/CD, where code improvements may be

    released quite frequently. “I write quality code in the first

    place” is great, but it’s not a substitute for automated testing.

  • Component isolation. To minimize the chances

    of malfeasance, sensitive systems should be modular in design,

    and unrelated components should run in separate processes --

    ideally in completely separate VMs separated by firewalls. A

    defect, code injection, privilege escalation, or social-engineering

    attack on one service or component should still be unable to tamper

    with another.

  • Storage redundancy. It’s amazing to think that

    some people implement trading and coin-storage systems without

    redundant storage. With many cryptocurrencies, your coins can be

    permanently lost if a unique private key, often only a few

    dozen bytes in length, is lost. Use automated deployments on the

    virtual cloud to ensure that all your trading and management

    systems are always deployed to redundant cloud storage with

    inherent fault tolerance and permanent backups.

  • Separation of roles. The

    amount of value accessed by some cryptocurrency components is so

    high that you must consider the impact of a compromised person. If

    your deployment architecture has a single “admin” role that gives

    one person the ability to deploy code, access storage, turn off

    monitors, and change audit logs, you are asking for that person to get

    into trouble. Don’t tempt anyone to pressure your staff: make it

    impossible for any one person to change where large sums of money

    go, or at least make it impossible for them to do so

    without setting off alarms. Create different admin roles for

    different system components and layers -- roles that are not

    available to the same person at the same time.
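
One way to make it impossible for a single person to redirect funds is a two-person rule: a sensitive change runs only after someone other than the requester approves it. The sketch below is illustrative only. The class name `SensitiveChange`, its fields, and the admin names are hypothetical, and a real system would enforce this in its access-control layer, not in application memory.

```python
from dataclasses import dataclass, field


@dataclass
class SensitiveChange:
    """A pending high-value action, e.g. changing a payout address."""
    description: str
    requested_by: str
    approvals: set = field(default_factory=set)

    def approve(self, admin: str) -> None:
        # The requester may never approve their own change.
        if admin == self.requested_by:
            raise PermissionError("requester cannot approve their own change")
        self.approvals.add(admin)

    def may_execute(self, required_approvals: int = 1) -> bool:
        # Executable only once enough other people have signed off.
        return len(self.approvals) >= required_approvals


change = SensitiveChange("redirect settlement wallet", requested_by="alice")
print(change.may_execute())   # False: nobody else has approved yet
change.approve("bob")
print(change.may_execute())   # True: a second person signed off
```

Rejected self-approvals are a natural event to feed into the alarms mentioned above.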

DevOps for automated trading

If you are trusting your computer system to move and trade

assets autonomously, you need absolute correctness, inviolable

security, and very rapid response to trouble incidents. In addition

to the concerns I listed for cryptocurrency, pay attention to

these DevOps priorities:

  • Automated load testing and regression

    testing. Since you are likely to update your trading algorithms

    many times a year, there are many opportunities to introduce

    performance problems. If a slow trade can be worse than no trade,

    no build should be allowed to go into production without automated

    performance testing under heavy simulated load. It’s not

    enough to say “my code runs fast”; you need to be able to say “my

    whole deployed system runs fast, even when bombarded by fast

    inputs.”

  • Immutable servers. It is incredibly

    tempting to patch production systems with improvements. But without

    careful controls, this leads to having production servers in a

    state that is completely unprecedented in your test

    environment. Instead, use automated deployment to create new copies

    of your servers with the new code already in place -- and when

    these pass your test suite, swap them in, make sure they’re up,

    then shut down and delete the old unpatched virtual servers. This

    kind of roll-forward can be completely automated with tools like

    Kubernetes, and can take advantage of cautious switch-over

    techniques like blue-green deployments or canary deployments.
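
The switch-over logic behind a blue-green deployment can be sketched in a few lines. This is a simplified model, not how Kubernetes or any particular tool implements it: the `router` dictionary stands in for a load balancer, and `health_check` is a hypothetical callback you would replace with real probes.

```python
from typing import Callable, Dict


def blue_green_switch(
    router: Dict[str, str],
    new_env: str,
    health_check: Callable[[str], bool],
) -> bool:
    """Point live traffic at new_env only if it passes its health check.

    Returns True if traffic was switched, False if the old environment
    was left in place.
    """
    old_env = router["live"]
    if not health_check(new_env):
        # New servers never received traffic; the old ones keep serving.
        return False
    router["live"] = new_env
    if not health_check(new_env):
        # Post-switch check failed: roll back to the previous environment.
        router["live"] = old_env
        return False
    # Only now is it safe to retire the old servers.
    router["retired"] = old_env
    return True


router = {"live": "blue"}
switched = blue_green_switch(router, "green", health_check=lambda env: True)
print(switched, router["live"])  # True green
```

The old environment is marked for retirement only after the new one has passed a check while carrying live traffic, which is what makes the rollback path cheap.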

 

DevOps for human-assisted trading

This is a relatively forgiving application if your users are

in-house or otherwise very tolerant of imperfection. (If you’re

providing trading services to external parties, you can expect to

be held to a very high standard.) Ask yourself, “what is the cost

of a typical failure to our business?” Systems with a human in the

loop are sometimes more error-tolerant than systems without a human

in the loop.

However, you will need to think much more about usability

testing, because a confusing UI update can introduce human error;

and your automated test suite should include tests that drive the

system through the UI, to detect coding bugs in that layer.

Human users of FinTech systems are often very powerful people,

and a small number of unhappy users can make for a very bad day. So

in addition to extensive testing, your DevOps practices may need to

include gradual deployment of new versions to a test population of

a few users (canary deployments), and support for both halts and

rollbacks in case of significant trouble reports.
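
A canary deployment needs a stable rule for deciding which users see the new version, plus a switch that halts it. The following is a minimal sketch under assumed names (`in_canary`, `version_for`, and the user ID format are all hypothetical). It hashes the user ID so that the same user always lands in the same group, and the split is only approximately the requested percentage.

```python
import hashlib


def in_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically place a stable subset of users in the canary group."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


def version_for(user_id: str, canary_percent: int, halted: bool) -> str:
    # A halt sends everyone back to the stable version immediately.
    if halted or not in_canary(user_id, canary_percent):
        return "stable"
    return "canary"


print(version_for("trader-17", canary_percent=100, halted=False))  # canary
print(version_for("trader-17", canary_percent=100, halted=True))   # stable
```

Because group membership depends only on the user ID, raising the percentage adds users to the canary group without moving anyone out of it.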

DevOps for asset valuation, market analysis, and research

These tasks often amount to a medium amount of math, performed

on a large number of input feeds and databases. Many firms

construct unique asset valuation formulas by insightfully combining

data that no one else was combining, or doing complex combinations

with uniquely clever functions and formulas. Competitive advantage

comes from generating unique insights, and these come from the

ability to scale up innovative formulas and innovative data

integrations quickly.

At FP Complete we regularly hear from FinTech firms that have

built important analyses running on just one or a few desktops, who

need these scaled up to a reliable server-based system. Beyond all

the usual software engineering and cloud deployment problems, a key

concern is maintainability. Analysts are used to updating their

formulas several times per month and then sharing them with

colleagues. It’s also important that new versions don’t necessarily

take the old versions offline, since a colleague may still be using them.

A version control system attached to a CI/CD system works

wonders for safe maintainability. But it needs to be coupled to a

simple metadata system that keeps track of which versions are now

running at which addresses -- and which allows versions with zero

remaining clients to be shut down.
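
Such a metadata system can start out very small. The sketch below keeps everything in memory and uses hypothetical names (`VersionRegistry`, the version labels, and the addresses are illustrative). A production version would persist this data and would be updated by the CI/CD pipeline on each deployment.

```python
class VersionRegistry:
    """Tracks which formula versions run at which address, and who uses them."""

    def __init__(self) -> None:
        self._address = {}   # version -> address
        self._clients = {}   # version -> set of client names

    def deploy(self, version: str, address: str) -> None:
        self._address[version] = address
        self._clients.setdefault(version, set())

    def attach(self, client: str, version: str) -> str:
        """Register a client on a version and return the address to call."""
        self._clients[version].add(client)
        return self._address[version]

    def detach(self, client: str, version: str) -> None:
        self._clients[version].discard(client)

    def idle_versions(self, keep: str) -> list:
        """Versions with zero clients, excluding the one named in `keep`."""
        return sorted(
            v for v, clients in self._clients.items()
            if not clients and v != keep
        )


registry = VersionRegistry()
registry.deploy("v1", "http://10.0.0.1:8080")
registry.deploy("v2", "http://10.0.0.2:8080")
registry.attach("carol", "v1")
print(registry.idle_versions(keep="v2"))  # []
registry.detach("carol", "v1")
print(registry.idle_versions(keep="v2"))  # ['v1']
```

The `keep` argument protects the newest version from being shut down just because nobody has attached to it yet.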

DevOps for consumer banking and account management

These applications require an exceptional amount of integration

with legacy systems, some very old. They have to maintain an

extremely consistent user interface, for use by clients who can be

upset by unnecessary change. They have all the requirements of

an e-commerce application, such as resistance to sudden surges in

demand. And they are subject to extremely large-scale Web-based

security attacks, as the payoff for a successful criminal break-in

could be enormous.

DevOps for voting

Voting, such as for shareholder votes or Board of Directors

elections, is a particularly sensitive subject, with huge decisions

being made and significant legal exposure. Ordinary voting is

rather similar to consumer banking and e-commerce (using votes

as the currency), but where governance rules require

anonymity, standards increase enormously. You must earn voter

confidence, and your systems should be able to pass a really

rigorous audit, including against insider malfeasance, while

protecting the privacy of each voter.

For such systems we recommend a DataOps solution in which raw

inputs (from user interaction) are fully quarantined from the apps

that handle persistent data storage, using very assertive firewalls

and very low-permission accounts. An auditor must be able to verify

that systems holding private user data are completely inaccessible

from unauthorized locations.

Since the anonymization steps at the application layer may be

intentionally irreversible, anonymized data should be stored with

very high redundancy. It may be impossible to auditably reconstruct

from scratch after identity data has been discarded.

Compliance, regulation, and auditability

For applications subject to extensive outside controls, it’s

important to demonstrate adherence to the spec (application

verification) and to be able to trace concerning behavior back from

the running system to the code and checkins that caused it

(traceability).

For verification, ensure that you have an automated test suite

with organized test case management and that it is automatically

fired up as a part of your CI/CD system. Be sure your full test suite is run before real deployments, not just your quick check test suite (sometimes called the smoke test),

which is automatically run every time a build is done.

For traceability, ensure that your CI/CD system inserts serial

numbers into your distributable software containers and other built

artifacts, and records what artifact version numbers were used

(including source code, libraries, and tools). And require that

checkins to your version control system include links back to the

requirements they were meant to satisfy.
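
Both traceability rules lend themselves to simple automation. The sketch below makes several assumptions: requirement IDs look like `REQ-123` (a hypothetical format), and the manifest fields are illustrative, not a standard. It shows a check for requirement links in a checkin message and a manifest that ties a serial number to the exact artifact and inputs.

```python
import hashlib
import json
import re

REQUIREMENT_LINK = re.compile(r"\bREQ-\d+\b")  # hypothetical ID format


def commit_links_requirement(message: str) -> bool:
    """True if the commit message cites at least one requirement ID."""
    return REQUIREMENT_LINK.search(message) is not None


def build_manifest(serial: int, artifact: bytes, commit: str, deps: dict) -> str:
    """Return a JSON record tying a built artifact to its exact inputs."""
    record = {
        "serial": serial,
        "artifact_sha256": hashlib.sha256(artifact).hexdigest(),
        "source_commit": commit,
        "dependencies": deps,  # library and tool versions used in the build
    }
    return json.dumps(record, sort_keys=True)


print(commit_links_requirement("Fix rounding in fee calc (REQ-142)"))  # True
print(commit_links_requirement("Fix rounding in fee calc"))            # False
```

Storing the manifest alongside the container image lets an auditor go from a running system to a serial number, and from there to the source commit and the requirement behind it.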

To ensure that what went into production is still

what’s in production, don’t grant permission to apply manual

changes to running servers. Create admin accounts with limited

permissions that can’t be used as “back doors,” and use

immutable servers so that a new deployment is required

when someone wants to change what’s on a server.

Security and endurance against direct attack

If your application is on the public Web, the server cloud

design, the software maintenance schedule, and the network/firewall

design must all be built to withstand malicious treatment.

The average IT organization currently spends 12% to 15% of

its budget on security.

DevOps can do much more to defend you than many people realize,

and can make the most of your security budget.

Many security breaches happen through social engineering. Reduce

these opportunities by automating control of your servers under

distinct robot admin accounts, ones that normal users never use.

Keep dangerous permissions away from regular IT staff going about

their days.

Other critical breaches have famously happened through old,

unpatched software with known vulnerabilities. Routinely audit the

versions of operating systems and runtime components that are

installed on all your servers, to ensure you don’t have obsolete

ones in production. This can be largely automated.
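
As a rough illustration of how such an audit can be automated, the sketch below compares an inventory of installed versions against a table of minimum acceptable versions. The server names, component names, and version numbers are made up for the example. It also assumes purely numeric dotted version strings, which real components do not always use.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_obsolete(inventory: dict, minimum: dict) -> list:
    """List (server, component, installed) entries below the minimum version.

    inventory maps server -> {component: installed version}
    minimum maps component -> lowest acceptable version
    """
    findings = []
    for server, components in sorted(inventory.items()):
        for name, installed in sorted(components.items()):
            required = minimum.get(name)
            if required and parse_version(installed) < parse_version(required):
                findings.append((server, name, installed))
    return findings


inventory = {
    "web-1": {"tls-lib": "1.0.2", "proxy": "1.25.3"},
    "db-1": {"tls-lib": "3.0.13"},
}
minimum = {"tls-lib": "3.0.0", "proxy": "1.24.0"}
print(find_obsolete(inventory, minimum))  # [('web-1', 'tls-lib', '1.0.2')]
```

Run on a schedule against inventory collected from every server, a check like this turns an occasional manual review into a routine alert.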

Security breaches are worsened by having far too much access

available in a single place, allowing a small intrusion to escalate

into a big one. Take advantage of cloud network configurability,

separate VMs and containers, and firewalls, to ensure that critical

attack targets (like databases and production servers) are hard to

reach and extra hard to enter as administrator, and to

ensure that critical attack vectors (like front-end servers) are

quarantined, firewalled, and monitored for unauthorized

activity.

How do we get there from here?

Unlike some older technologies, DevOps is not monolithic and can

be implemented in small steps over an arbitrary period of time.

Even the sequence of these steps is flexible. FP Complete

recommends an incremental approach.

If you already have traditional software engineering and

traditional system operations and monitoring in place, focus next

on (A) streamlining your software engineering environment, or (B)

containerizing and automating your deployment systems. Either makes

a great next step and will put you well on the road to complete

DevOps.

As always, remember that FP Complete is available to do a

readiness assessment project for your DevOps, cloud,

and other IT systems engineering. With experience on numerous

advanced IT projects, we’re happy to team up with you on

planning, design, implementation, knowledge transfer, audits, and

upgrades.

For More Information

  • Ten Common Mistakes to Avoid in FinTech Software Development

  • How to Measure the Success of DevOps

  • How do I build my DevOps Team?
