Sep 21, 2016
This is part 2 of a three-part series. If you haven't seen it
already, I'd recommend starting with the first
part, which covers communication protocols and streaming of
data. This second part will cover network communication and some
basic concurrency in Haskell.
Simple HTTP client
We saw previously how to send and receive binary data using the
conduit library. We're going to build on this with a conduit-aware
network library. This first example will make a very simplistic,
hard-coded HTTP request and send the entire response from the
server to standard output.
The runTCPClient function creates the actual TCP connection, and
provides access to it via the appData value. This value allows us to
send data to the server (via appSink) and get data from the server
(via appSource). We can also get information about the connection
such as the locally used port number, which we're not using in this
example.
We've hard-coded a settings value that states we should connect to
host httpbin.org* on port 80. We've also hard-coded an HTTP request,
which is thoroughly uninteresting.
Once our connection has been established, we send our hard-coded request to the server with yield request $$ appSink appData. When that's complete, we stream all data from the server to standard output with appSource appData $$ stdoutC.
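Putting those pieces together, a minimal sketch of such a client might look like the following. The /get path and the exact headers are my own assumptions, since any valid request would do. It also assumes the conduit and conduit-extra packages are available; newer conduit releases deprecate $$ in favor of runConduit and .|, but $$ is kept here to match the prose.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Conduit (stdoutC, yield, ($$))
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B8
import Data.Conduit.Network (appSink, appSource, clientSettings, runTCPClient)

-- Hard-coded request. The path and headers are assumptions for this
-- sketch. HTTP lines end in CRLF, and a blank line ends the headers.
request :: ByteString
request = B8.intercalate "\r\n"
    [ "GET /get HTTP/1.1"
    , "Host: httpbin.org"
    , "Connection: close"
    , ""
    , ""
    ]

main :: IO ()
main = runTCPClient settings $ \appData -> do
    -- Send the whole request to the server...
    yield request $$ appSink appData
    -- ...then stream everything the server sends to standard output.
    appSource appData $$ stdoutC
  where
    settings = clientSettings 80 "httpbin.org"
```

The Connection: close header asks the server to hang up after responding, which is what lets the second conduit terminate.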
The output from this looks very much like you'd expect it to: the server's raw HTTP response, headers and all, written to standard output.
* Side note: anyone playing with HTTP client software should definitely check out httpbin.org; it's a great resource.
Upgrading to TLS
On a small tangent, it's trivial to adapt the above program to
work over secure HTTPS instead of plaintext HTTP. All we need to do
is:
Use the Data.Conduit.Network.TLS module from the network-conduit-tls library
Replace runTCPClient with runTLSClient, and clientSettings with tlsClientConfig
Change port 80 to port 443
The code looks as follows. To convince yourself that this is real, go ahead and run it and see what the url value in the response body looks like.
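Applying the three changes listed above to a simple client gives roughly the following sketch. As before, the /get path and the headers are assumptions of mine, and the code assumes the conduit, conduit-extra, and network-conduit-tls packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Conduit (stdoutC, yield, ($$))
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B8
import Data.Conduit.Network (appSink, appSource)
import Data.Conduit.Network.TLS (runTLSClient, tlsClientConfig)

-- Same style of hard-coded request as a plaintext client would send;
-- the path and headers are assumptions for this sketch.
request :: ByteString
request = B8.intercalate "\r\n"
    [ "GET /get HTTP/1.1"
    , "Host: httpbin.org"
    , "Connection: close"
    , ""
    , ""
    ]

main :: IO ()
main = runTLSClient settings $ \appData -> do
    yield request $$ appSink appData
    appSource appData $$ stdoutC
  where
    -- Port 443 instead of 80, tlsClientConfig instead of clientSettings.
    settings = tlsClientConfig 443 "httpbin.org"
```

Nothing in the body of the callback had to change: the TLS layer hands back the same kind of appData value, so appSource and appSink work as they did over plain TCP.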
Echo server
Let's play with the server side of things. We're going to
implement an echo server, which will receive a chunk of data from
the client and then send it right back.
This listens on port 4200, on all network interfaces ("*"). We start
our server with runTCPServer, which grabs a listening socket and waits
for connections. For each connection, it forks a new thread, and
runs the provided application. In this case, our application is
trivial: we connect the source to the sink, automatically piping
data from the connection back to itself.
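A sketch of the server just described, assuming the conduit and conduit-extra packages (the top-level settings name is only there for readability):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Conduit (($$))
-- Provides ServerSettings, serverSettings, runTCPServer, appSource, appSink.
import Data.Conduit.Network

-- Listen on port 4200 on all network interfaces.
settings :: ServerSettings
settings = serverSettings 4200 "*"

main :: IO ()
main = runTCPServer settings $ \appData ->
    -- Runs once per connection, in its own thread: pipe whatever the
    -- client sends straight back to the same client.
    appSource appData $$ appSink appData
```

With this running, something like telnet localhost 4200 in another terminal should echo back each line you type.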
To stress a point above: this is a fully multithreaded server
application. You can make multiple telnet connections
to the server and interact with each of them independently. This is
a lot of bang for very little buck.
For those of you concerned about the inefficiency of forking a
new thread for each incoming connection: Haskell's runtime is built
on top of green threads, making the act of forking very
cheap. There are more details available in a talk I gave on
"Haskell for fast, concurrent, robust services" (video link).
Full duplex
The examples so far have all been half duplex, meaning
they have always been either sending or receiving data. Let's
implement a full duplex application: a simple telnet client
replacement. We need to wait for any input from standard input,
while at the same time waiting for any input from the socket. We're
going to take advantage of Haskell threading to handle this case
too:
The race_ function is a wonderful helper for concurrency, which says
"run these two actions, see which one finishes first, kill the other
one, and ignore any results (the _ at the end of the name)." It has a
sibling function, concurrently, for running two things until they both
complete. You can implement a surprisingly large number of common
concurrency solutions using just these two functions. For more
information, see the library package tutorial on haskell-lang.org.
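For the telnet client replacement described in this section, a minimal sketch could look like this. The localhost host and port 4200 are assumptions on my part (chosen so it can talk to an echo server listening on port 4200), and race_ comes from the async package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Conduit (stdinC, stdoutC, ($$))
import Control.Concurrent.Async (race_)
-- Provides ClientSettings, clientSettings, runTCPClient, appSource, appSink.
import Data.Conduit.Network

-- Where to connect; host and port are assumptions for this sketch.
settings :: ClientSettings
settings = clientSettings 4200 "localhost"

main :: IO ()
main = runTCPClient settings $ \server ->
    -- Run both directions at once; when either side finishes (end of
    -- standard input, or the server closing the connection), race_
    -- cancels the other one and we exit.
    race_
        (stdinC $$ appSink server)      -- standard input -> socket
        (appSource server $$ stdoutC)   -- socket -> standard output
```

Using race_ rather than concurrently is a deliberate choice here: once one direction is done there is no point keeping the other alive.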
You may be terrified of the performance characteristics of this:
we've introduced two blocking threads, when theoretically
callback-based I/O would be far more efficient! Not to worry: in
Haskell, the runtime system uses a fully callback based system
under the surface, using whatever system calls are relevant for
your operating system. When a Haskell green thread makes a
"blocking" I/O call, what actually happens is the runtime puts the
thread to sleep, installs a callback handler to wait for data to be
available, and when the callback is triggered, wakes the green
thread up again.
The details of the Haskell runtime are well described in the paper Mio: A High-Performance Multicore IO Manager for GHC.
Fortunately, for most real world cases, you can write the naive,
easy-to-conceptualize I/O operations based on blocking semantics,
and automatically get the great performance you'd want from
event/callback based system calls.
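To get a feel for how cheap this is, here is a small sketch using only the base library. The runWorkers name, the thread count, and the delay are all made up for illustration. It forks ten thousand green threads, each of which makes a "blocking" call (threadDelay standing in for real I/O), and waits for all of them to report back.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)

-- Fork n green threads that each block briefly, then count how many
-- reported back. Hypothetical helper, not part of any library.
runWorkers :: Int -> IO Int
runWorkers n = do
    done <- newEmptyMVar
    replicateM_ n $ forkIO $ do
        -- A "blocking" call: the runtime parks this green thread and
        -- wakes it up again later, rather than tying up an OS thread.
        threadDelay 100000
        putMVar done ()
    -- Collect one signal per worker.
    length <$> replicateM n (takeMVar done)

main :: IO ()
main = do
    finished <- runWorkers 10000
    putStrLn (show finished ++ " threads finished")
```

Each worker is written in plain blocking style, yet the sleeps overlap instead of running one after another, because the runtime does the scheduling underneath.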
Client and server in same process
Just to prove that we can: let's throw our client and server
into a single process, using the same concurrency approach we've
had until now.
This isn't a particularly useful application (stdinC $$ stdoutC would
do the same thing without
wasting a network connection), but it does show how easy it is to
combine various pieces of code in Haskell for concurrent
applications.
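One way to sketch this combination, assuming the conduit, conduit-extra, and async packages. The one-second threadDelay is a crude assumption of mine to give the server time to start listening before the client connects.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Conduit (stdinC, stdoutC, ($$))
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race_)
-- Provides the settings types, runTCPServer, runTCPClient, appSource, appSink.
import Data.Conduit.Network

serverConf :: ServerSettings
serverConf = serverSettings 4200 "*"

clientConf :: ClientSettings
clientConf = clientSettings 4200 "localhost"

-- Echo server: send each connection's input back to itself.
server :: IO ()
server = runTCPServer serverConf $ \appData ->
    appSource appData $$ appSink appData

-- Telnet-style client: standard input to the socket, socket to
-- standard output, both at the same time.
client :: IO ()
client = do
    threadDelay 1000000 -- crude wait for the server to start (assumption)
    runTCPClient clientConf $ \appData ->
        race_ (stdinC $$ appSink appData) (appSource appData $$ stdoutC)

main :: IO ()
main = race_ server client -- when the client exits, the server is killed
```

The server never finishes on its own, so race_ here effectively means "run the server for as long as the client is alive."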
Next time on Practical Haskell
We've so far figured out how to deal with our simple file
mirror's communication protocol, and how to do network
communication. All that's left is combining these two things
together and wrapping it up with a command line interface. Stay
tuned!