In what sense is the IO Monad pure?

HaskellIoMonads

Haskell Problem Overview


I've had the IO monad described to me as a State monad where the state is "the real world". The proponents of this approach to IO argue that this makes IO operations pure, as in referentially transparent. Why is that? From my perspective it appears that code inside the IO monad have plenty of observable side effects. Also, isn't it possible to describe pretty much any non-pure function like a function of the real world? For example, can't we think of, say, C's malloc as being a function that takes a RealWorld and an Int and returns a pointer and a RealWorld, only just like in the IO monad the RealWorld is implicit?

Note: I know what a monad is and how it's used. Please don't respond with a link to a random monad tutorial unless it specifically adresses my question.

Haskell Solutions


Solution 1 - Haskell

I think the best explanation I've heard was actually fairly recently on SO. IO Foo is a recipe for creating a Foo. Another common, more literal, way of saying this is that it is a "program that produces a Foo". It can be executed (many times) to create a Foo or die trying. The execution of the recipe/program is what we ultimately want (otherwise, why write one?), but the thing that is represented by an IO action in our code is the recipe itself.

That recipe is a pure value, in the same exact sense that a String is a pure value. Recipes can be combined and manipulated in interesting, sometimes astonishing, ways, but the many ways these recipes can be combined (except for the blatantly non-pure unsafePerformIO, unsafeCoerce, etc.) are all completely referentially transparent, deterministic, and all that nice stuff. The resulting recipe depends in absolutely no way whatsoever on the state of anything other than the recipes that it was built up from.

Solution 2 - Haskell

> Also, isn't it possible to describe pretty much any non-pure function like a function of the real world? For example, can't we think of, say, C's malloc as being a function that takes a RealWorld and an Int and returns a pointer and a RealWorld, only just like in the IO monad the RealWorld is implicit?

For sure ...

The whole idea of functional programming is to describe programs as a combination of small, independent calculations building up bigger computations.

Having these independent calculations, you'll have lots of benefits, reaching from concise programs to efficient and efficiently parallelizable codes, laziness up to the the rigorous guarantee that control flows as intended - with no chance of interference or corruption of arbitrary data.

Now - in some cases (like IO), we need impure code. Calculations involving such operations cannot be independent - they could mutate arbitrary data of another computation.

The point is - Haskell is always pure, IO doesn't change this.

So, our impure, non-independent codes have to get a common dependency - we have to pass a RealWorld. So whatever stateful computation we want to run, we have to pass this RealWorld thing to apply our changes to - and whatever other stateful computation wants to see or make changes has to know the RealWorld too.

Whether this is done explicitly or implicitly through the IO monad is irrelevant. You build up a Haskell program as a giant computation that transforms data, and one part of this data is the RealWorld.

Once the initial main :: IO () gets called when your program is run with the current real world as a parameter, this real world gets carried through all impure calculations involved, just as data would in a State. That's what monadic >>= (bind) takes care of.

And where the RealWorld doesn't get (as in pure computations or without any >>=-ing to main), there is no chance of doing anything with it. And where it does get, that happened by purely functional passing of an (implicit) parameter. That's why

let foo = putStrLn "AAARGH" in 42

does absolutely nothing - and why the IO monad - like anything else - is pure. What happens inside this code can of course be impure, but it's all caught inside, with no chance of interfering with non-connected computations.

Solution 3 - Haskell

Suppose we have something like:

animatePowBoomWhenHearNoiseInMicrophone :: TimeDiff -> Sample -> IO ()
animatePowBoomWhenHearNoiseInMicrophone
    levelWeightedAverageHalfLife levelThreshord = ...

programA :: IO ()
programA = animatePowBoomWhenHearNoiseInMicrophone 3 10000

programB :: IO ()
programB = animatePowBoomWhenHearNoiseInMicrophone 3 10000

Here's a point of view:

animatePowBoomWhenHearNoiseInMicrophone is a pure function in the sense that its results for same input, programA and programB, are exactly the same. You can do main = programA or main = programB and it would be exactly the same.

animatePowBoomWhenHearNoiseInMicrophone is a function receiving two arguments and resulting in a description of a program. The Haskell runtime can execute this description if you set main to it or otherwise include it in main via binding.

What is IO? IO is a DSL for describing imperative programs, encoded in "pure-haskell" data structures and functions.

"complete-haskell" aka GHC is an implementation of both "pure-haskell", and an imperative implementation of an IO decoder/executer.

Solution 4 - Haskell

It quite simply comes down to extensional equality:

If you were to call getLine twice, then both calls would return an IO String which would look exactly the same on the outside each time. If you were to write a function to take 2 IO Strings and return a Bool to signal a detected difference between them both, it would not be possible to detect any difference from any observable properties. It could not ask any other function whether they are equal and any attempt at using >>= must also return something in IO which all are equall externally.

Solution 5 - Haskell

I'll let Martin Odersky answer this

>The IO monad does not make a function pure. It just makes it obvious that it's impure.

Sounds clear enough.

Solution 6 - Haskell

Even though its title is a bit weird (in that it doesn't precisely match the content) the following haskell-cafe thread contains a nice discussion about different IO models for Haskell.

http://www.mail-archive.com/[email protected]/msg79613.html

Solution 7 - Haskell

Well, this is what we have been taught at college -

Function is referentially transparent when it always returns the same value for specified input (or the same expression always evaluates to same value in the same context). Therefore, for example getChar would not be referentially transparent if it had type signature just () -> Char or Char, because you can get different results if you call this function multiple times with the same argument.

But, if you introduce IO monad, then getChar can have type IO Char and this type has only one single value - IO Char. So getChar allways reutrns the same value, no matter on which key user really pressed.

But you are still able to "get" the underlying value from this IO Char thing. Well, not really get, but pass to another function using bind operator (>>=), so you can work with the Char that user entered further in your program.

Solution 8 - Haskell

Philip Wadler writes:

> In an impure language, an operation like tick would be represented by a function of type
() -> (). The spurious argument () is required to delay the effect until the function is applied, and since the output type is () one may guess that the function's purpose lies in a side effect. In contrast, here tick has type M (): no spurious argument is needed, and the appearance of M explicitly indicates what sort of effect may occur.

I fail to understand how M () makes the empty argument list () less spurious but Wadler is pretty clear that monads just indicate a kind of side-effect, they do not eliminate it.

Solution 9 - Haskell

> In what sense is the monadic IO type pure?

In the sense that values of the IO type are portions of Standard ML abstract imperative code which ideally can only be processed by the RTS of a Haskell implementation - in How to Declare an Imperative, Philip Wadler provides a hint as to how this is possible:

(* page 26 *)
type 'a io   = unit -> 'a

infix >>=
val >>=      : 'a io * ('a -> 'b io) -> 'b io
fun m >>= k  = fn () => let
                          val x = m ()
                          val y = k x ()
                        in
                          y
                        end

val return   : 'a -> 'a io
fun return x = fn () => x

(* page 27 *)
val execute : unit io -> unit
fun execute m = m ()

However, not everyone finds this situation acceptable:

> [...] a state-less model of computation on top of a machinery whose most eminent characteristic is state [means] the gap between model and machinery is wide, and therefore costly to bridge. [...]
This has in due time also been recognized by the protagonists of functional languages. They have introduced state (and variables) in various tricky ways. The purely functional character has thereby been compromised and sacrificed. [...]
> > Niklaus Wirth.

...anyone for Miranda(R)?


> I've had the IO monad described to me as a State monad where the state is "the real world".

That would be the classic pass-the-planet model of I/O, which Clean uses directly:

import StdFile
import StdMisc
import StdString

Start :: *World -> *World
Start w = putString "Hello, world!\n" w

putString :: String *World -> *World
putString str world
    # (out, world1) = stdio world
    # out1          = fwrites str out
    # (_, world2)   = fclose out1 world1
    = world2

putChar :: Char *World -> *World
putChar c w = putString {c} w

> The proponents of this approach to I/O argue that this makes I/O operations pure, as in referentially transparent. Why is that?

Because it's usually correct.

From the standard Haskell 2010 library module Data.List:

mapAccumL _ s []        =  (s, [])
mapAccumL f s (x:xs)    =  (s'',y:ys)
                           where (s', y ) = f s x
                                 (s'',ys) = mapAccumL f s' xs

If this idiom is so common that it has specific definitions to support it, then its use as a model of I/O (with a suitable state-type) is really no great surprise - from pages 14-15 of State in Haskell by John Launchbury and Simon Peyton Jones:

> How, then, are I/O operations executed at all? The meaning of the whole program > is given by the value of the top-level identifier mainIO: > > mainIO :: IO () > > mainIO is an I/O state transformer, which is applied to the external world state by > the operating system. Semantically speaking, it returns a new world state, and the > changes embodied therein are applied to the real world.

(...back when main was called mainIO.)

The most recent Clean Language Report (I/O on the Unique World on page 24 of 148) goes into more detail:

> The world which is given to the initial expression is an abstract data structure, an abstract world of type *World which > models the concrete physical world as seen from the program. The abstract world can in principle contain > anything what a functional program needs to interact during execution with the concrete world. The world can be seen as a > state and modifications of the world can be realized via state transition functions defined on the world or a part of the world. By > requiring that these state transition functions work on a unique world the modifications of the abstract world can directly be > realized in the real physical world, without loss of efficiency and without losing referential transparency.

In terms of semantics, the crucial point is this: for an I/O-centric program's changes to take effect, that program must return the final world-state value.

Now consider this small Clean program:

Start :: *World -> *World
Start w = loopX w

loopX :: *World -> *World
loopX w
    # w1 = putChar 'x' w
    = loopX w1

Obviously the final World value is never returned so 'x' should not be seen at all...

> Also, isn't it possible to describe pretty much any non-pure function like a function of the real world?

Yes; that's more-or-less how the FFI works in Haskell 2010.


> From my perspective it appears that code inside the monadic IO type have plenty of observable side effects.

If you're using GHC, it isn't an appearance - from A History of Haskell (page 26 of 55) by Paul Hudak, John Hughes, Simon Peyton Jones, and Philip Wadler:

> Of course, GHC does not actually pass the world around; instead, it passes a dummy “token,” to ensure proper sequencing of actions in the presence of lazy evaluation, and performs input and output as actual side effects!

But that's merely an implementation detail:

> An IO computation is a function that (logically) takes the state of the world, and returns a modified world as well as the return value.

> Logic doesn't apply to the real world. > > Marvin Lee Minsky.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionkeiterView Question on Stackoverflow
Solution 1 - HaskellmokusView Answer on Stackoverflow
Solution 2 - HaskellDarioView Answer on Stackoverflow
Solution 3 - HaskellyairchuView Answer on Stackoverflow
Solution 4 - HaskellTim MatthewsView Answer on Stackoverflow
Solution 5 - HaskellTomer Ben DavidView Answer on Stackoverflow
Solution 6 - HaskellOzgurView Answer on Stackoverflow
Solution 7 - HaskellMartin JonášView Answer on Stackoverflow
Solution 8 - HaskellValView Answer on Stackoverflow
Solution 9 - HaskellatraversView Answer on Stackoverflow