Data

Jan 10, 2018

Big Data vs Business Intelligence: What’s the difference?

Data analytics systems are the cutting edge of modern corporate computing. While many people may feel they are behind the “state of the art” they read about, the truth is these are projects we’re implementing right now for prominent companies in life sciences, finance, healthcare, Internet services, and aerospace. They have a lot in common with each other, and likely even with your computing environment.

That’s the truth on the ground. Meanwhile, we are constantly seeing buzzwords in the tech media, as writers struggle to help everyone understand what’s going on out there. Big Data (BD) and Business Intelligence (BI) get talked about a lot -- but a lot of people are unclear on what they mean. So let’s cut through the clutter and look at what these projects are really about: who they are for, what problems they solve, what they have in common, and how they differ.

What is big data, really?

My friend and Deerfield schoolmate Doug Laney, a distinguished analyst on Gartner’s data team, famously defined big data as having volume, velocity, and variety.

Raw data points are available to us all in a volume no one really anticipated, as nearly every object and action in an enterprise is tracked. Recently the CTO of a hospital group described to me their world, in which 22,000 medical devices are putting out logs of data all the time. The volume of information on hand is overwhelming. How do we move it around and store it? How does one make sense of it?

Adding to this is a non-stop stream of new data points, coming in at high velocity. Our friend and client Tom Doris, founder of financial analytics firm OTAS and now a LiquidNet executive, says many stock analysts use their systems to organize the millions of new data points emerging every day from each of several stock markets.
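To make “velocity” concrete, here is a minimal sketch (in Python, with invented tick prices) of the streaming style this demands: rather than storing every point before analyzing, you fold each new data point into running aggregates as it arrives.

```python
# A sketch of coping with velocity: consume data points as a stream,
# keeping only running aggregates instead of the full history in memory.
# The tick prices are invented for illustration.

def running_stats(ticks):
    count, total, high = 0, 0.0, float("-inf")
    for price in ticks:
        count += 1
        total += price
        high = max(high, price)
        yield count, total / count, high  # points seen, mean, max so far

for count, avg, high in running_stats([101.2, 101.5, 100.9, 102.3]):
    print(f"after {count} ticks: avg={avg:.2f} high={high:.2f}")
```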

This seems hard enough, and then you add in the enormous variety of data sources relevant to making good decisions. The delightful Chris Mackey, CEO of our client Mackey RMS, focuses on organizing the extensive research and collaboration that goes into major decisions for hedge funds and the like. How do you choose a course of action when crucial data is in a dozen different formats, on numerous different servers, behind a range of APIs and addresses?
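One common first step against “variety” is normalization: translating each source’s format into a single internal record shape. A minimal sketch, with invented field names and two toy feeds (one JSON, one CSV):

```python
import csv
import json
from io import StringIO

# Hypothetical example: two feeds deliver the same kind of record in
# different formats; each gets its own adapter into one common shape.

def from_json(raw):
    rec = json.loads(raw)
    return {"ticker": rec["symbol"], "note": rec["text"]}

def from_csv(raw):
    row = next(csv.reader(StringIO(raw)))
    return {"ticker": row[0], "note": row[1]}

feeds = [
    (from_json, '{"symbol": "ACME", "text": "Raised guidance."}'),
    (from_csv, "ACME,Margins improving"),
]

normalized = [parse(raw) for parse, raw in feeds]
print(normalized)  # one uniform shape, whatever the source format
```

Once everything downstream sees one shape, adding a thirteenth format means writing one more adapter, not touching every report.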

Big data is the ultimate “be careful what you wish for” scenario. Do you wish you knew what was going on? Okay -- now what would you do if you knew almost everything that was going on? It’s like buying the daily output from a gold mine. No human, without massive machine assistance, can extract most of the value from that torrent of gold ore.

Then what’s business intelligence?

The idea of business intelligence predates computers, but it has been made much more important -- and more useful -- by the vast amount of data we now have. BI is the practice of making better decisions through better decision-support systems.


These systems can be as simple as reporting and charting software, or as elaborate as machine learning and artificial intelligence. And they rely on organized streams of input data -- which don’t even have to be “Big” to be extremely useful. In fact, a lot of BI involves digesting the complexity of the raw data, bringing it down to human-usable tools like dashboards, metrics, and exception detection. Many BI systems are hierarchical -- presenting decision-makers with a summary of the current situation, and features to filter or explore the data to learn more about any part.
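The pattern is easy to show in miniature. Here is a minimal sketch (in Python, with invented hospital-style numbers and an arbitrary threshold) of “summarize, then flag exceptions to drill into”:

```python
from statistics import mean

# Invented raw records: individual patient wait times by ward.
readings = [
    {"ward": "A", "wait_minutes": 12},
    {"ward": "A", "wait_minutes": 15},
    {"ward": "B", "wait_minutes": 55},
    {"ward": "B", "wait_minutes": 61},
]

def summarize(rows, key, value):
    """Roll raw rows up to one average per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[value])
    return {k: mean(v) for k, v in groups.items()}

summary = summarize(readings, "ward", "wait_minutes")
exceptions = {k: v for k, v in summary.items() if v > 30}  # arbitrary threshold

print(summary)     # the dashboard-level view
print(exceptions)  # where a decision-maker should drill in
```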

Our client Seattle Cancer Care Alliance, for example, provides life-saving treatments at several leading cancer-care institutions. Day in and day out, they provide outstanding care to a great many patients. But wouldn’t it be even more exciting to constantly learn from the outcomes of all these treatments, to see which therapies are working best for what sorts of cases -- and then to use this knowledge to deliver the best possible course of care for every future patient? While a typical analysis might only involve thousands of patients in total -- hardly enough to sound like Big Data -- the caliber of insight that must be provided is exceptionally high.

For a very different example, consider the project we’re working on right now with a multi-billion-dollar manufacturing company. As is typical these days, their big expensive machines have a computer on board that constantly logs their performance. But a lot of this data just goes into storage, with no one looking at most of it. What they want is to understand the leading causes of breakage and downtime, and gradually eliminate these -- through offline analysis to discover best practices for maintenance, and near-real-time analysis to improve operations plans during the workday -- making their operators into computer-assisted super-operators. That’s business intelligence: turning available data into better decision-making.
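In miniature, the offline half of that analysis can be as simple as ranking causes by hours lost. A sketch, assuming the machine logs have already been parsed into (machine, cause, hours) records -- all values invented:

```python
from collections import Counter

# Invented, pre-parsed downtime events: (machine, cause, hours lost).
events = [
    ("press-1", "bearing wear", 4.0),
    ("press-2", "bearing wear", 6.5),
    ("lathe-3", "coolant failure", 2.0),
    ("press-1", "operator error", 1.0),
]

downtime_by_cause = Counter()
for _machine, cause, hours in events:
    downtime_by_cause[cause] += hours

# Leading causes first: the targets for new maintenance practices.
for cause, hours in downtime_by_cause.most_common():
    print(f"{cause}: {hours:.1f} hours lost")
```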

How can I get both?

As you’ve probably guessed, BD and BI aren’t competing approaches -- they are IT architectures that play well together, with Business Intelligence as essentially a layer on top of Big Data.

We find that most companies already have good IT organizations in place, with the skills to develop new software when needed, and to integrate existing Commercial Off-the-Shelf (COTS) tools when available. The problem, then, isn’t lack of building blocks. Anyone can obtain or write a program to input a table of data and graph it, or compute subtotals. The problem is how to put these building blocks together, and especially, how to scale up trivial solutions to production scale.

We break the BD/BI work into three doable pieces: DevOps, DataOps, and cloud application architecture.

DevOps is another jargon term in constant use -- including here at FP Complete. It means the engineering that happens *after* you’ve written some code but *before* your end user receives the final results on-screen. DevOps is a set of tools and best practices for scaling up: from a data analysis that runs one time, on one user’s machine, to a system that runs all the time, on a reliable, scalable, and secure cloud-based system, to support everyone who needs the answers. If you’re still using manual processes and mysterious “IT wizards” to scale up your analyses from the laptop to the data center, you’re not going to reach Big Data scale or achieve much Business Intelligence. DevOps is a proven set of techniques and technologies for integration, deployment, scale-up, and continuous operations.

DataOps is a newer concept -- it’s “DevOps for data.” Just as numerous tools can clean up and scale up your analytics apps, a parallel set of tools can clean up and scale up your actual data feeds. DataOps includes data cleansing, schema enforcement, storage and replication, warehousing and repositories, metadata management, version management, uniform API provision, security and monitoring -- all the tools and processes to turn your “pile” of data into an “answer factory” capable of responding to any reasonable query, and constantly ingesting and incorporating the latest data streams.
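Schema enforcement and cleansing, the heart of that list, look like this in miniature. A minimal sketch, with invented field names and rejection rules, of a cleansing step that quarantines bad records instead of letting them crash or pollute downstream analyses:

```python
from typing import Optional

def clean(record: dict) -> Optional[dict]:
    """Return a normalized record, or None to quarantine it."""
    try:
        device_id = str(record["device_id"]).strip()
        reading = float(record["reading"])
    except (KeyError, TypeError, ValueError):
        return None  # malformed input is rejected, not crashed on
    if not device_id or not (0.0 <= reading <= 500.0):
        return None  # the range check is an invented rule for illustration
    return {"device_id": device_id, "reading": reading}

raw_feed = [
    {"device_id": " pump-7 ", "reading": "42.5"},  # messy but salvageable
    {"device_id": "pump-9"},                       # missing field
    {"device_id": "pump-3", "reading": "9999"},    # out of range
]

cleaned = [r for r in map(clean, raw_feed) if r is not None]
print(cleaned)  # only records that pass the schema survive
```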

Cloud application architecture means designing your distributed system -- servers, apps, tools, work processes, jobs, and data flows -- into a sensible whole. These days, almost no one should be designing a major new IT system from scratch. If your company is mostly writing new software from a blank-screen start, you’re wasting work and losing time. Understanding best practices and existing IT architectures, and picking components from the existing inventory, will usually get you 80% of the way toward a good solution. Reuse makes all the difference! Cloud features and distributed, service-oriented architectures make building-block-style development productive and fast. Bug-resistant architectures, with clear separation of responsibilities, will allow you to break your IT system into pieces -- most not written from scratch -- each maintainable on its own schedule, and improvable at will.

What’s realistic to expect?

The good news is that Big Data is not an all-or-nothing proposition, and neither is Business Intelligence. You can make stepwise progress on both, which is exactly what we encourage our clients to do.

Phase 1 will be BI with the limited portion of your data that’s already in good condition. It’s fairly straightforward to create new IT solutions -- I don’t say new apps here, because these solutions will use existing code for much of the work -- that will answer whatever you feel are the most pressing questions about your data. You are probably already doing some of this, without even calling it business intelligence. Most companies stay in Phase 1 for years, never really getting the answers they wish they had, but at least answering a few crucial questions with hand-built systems.

Phase 2 will be basic DevOps -- turning your IT work into an IT factory, in which any analysis that runs for *someone* can be turned into an analysis that runs for *everyone, all the time* -- maintainably, reproducibly, reliably, scalably, securely. Likely steps here include Version Control, Continuous Integration, Continuous Deployment, Automated Testing, Cloud Scalability, System Monitoring, and possibly Security Auditing. With many of these things implemented, you will see your BI productivity go way up, with new solutions coming online regularly and predictably.
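Of those steps, Automated Testing is often the easiest place to start. A minimal sketch: once an analysis is a plain function, a test like this can run in CI on every commit. The analysis itself is a deliberately trivial stand-in:

```python
import unittest

def average_daily_output(readings):
    """A stand-in 'analysis': any real one would be tested the same way."""
    return sum(readings) / len(readings)

class TestAnalysis(unittest.TestCase):
    def test_known_input_gives_known_answer(self):
        # Pin down behavior so refactors can't silently change results.
        self.assertEqual(average_daily_output([10, 20, 30]), 20)

    def test_empty_input_fails_loudly(self):
        with self.assertRaises(ZeroDivisionError):
            average_daily_output([])

if __name__ == "__main__":
    unittest.main()
```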

Phase 3 will be basic DataOps, launched when you rapidly discover that the questions you really want answered require data that’s “somewhere around here” and not yet organized. You can expect to do an inventory of the many formal and informal data feeds you depend on: what format they’re in, how they arrive, how accurate they are, and how they are accessed. A set of automated systems will be set up to filter, correct, or “cleanse” these feeds, and then to make them available on high-powered, typically cloud-based, distributed data servers. A set of metadata or “tables of contents” will be set up to help your team locate and tap into the data sources needed to answer a particular query. Data sources will likely always be federated, with no one format conquering all, and with cloud services stitching up the differences. With DataOps implemented, you can expect to pose any reasonable question about “what’s really going on,” and, if the data is present somewhere, a system that answers it will be feasible.
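The “table of contents” idea can start very small. A sketch of a tiny metadata catalog -- every entry here is invented -- recording where each feed lives, its format, and its freshness, so queries can be routed to the right source:

```python
# Invented catalog entries; real ones would live in a shared store.
catalog = {
    "machine-logs": {
        "location": "s3://example-bucket/machine-logs/",  # hypothetical
        "format": "jsonl",
        "updated": "hourly",
    },
    "maintenance-records": {
        "location": "postgres://ops-db/maintenance",  # hypothetical
        "format": "sql",
        "updated": "daily",
    },
}

def sources_for(topic):
    """List feeds whose names mention the topic."""
    return [name for name in catalog if topic in name]

print(sources_for("machine"))  # -> ['machine-logs']
```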

The difference between Big Data and Business Intelligence will fade

We find that mastery of data streams is more and more central to every industry. Whether you’re in financial technology (FinTech), aerospace, life sciences, or health care, your world is likely to look more and more like the world of secure Internet services and cloud computing. People in every industry tell us that this is where they’re going.

As automation increases, Big Data will become the norm, and we’ll soon just be calling it Data. Just as DevOps is becoming the norm for innovative IT groups, so will DataOps. IT departments will more and more resemble a two-sided “zipper,” marrying ever-improving data inputs with ever-improving software inputs, into ever-improving online solutions that run in their data centers and in the cloud.

It will be a long road, but realistically we can look forward to a future in which any question you have about your operations, your customers, your patients, or your research can be answered with real data -- reliably, reproducibly, and all the time.
