Format floats with standard json module


Python Problem Overview


I am using the standard json module in Python 2.6 to serialize a list of floats. However, I'm getting results like this:

>>> import json
>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I want the floats to be formatted with only two decimal digits. The output should look like this:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

I have tried defining my own JSON Encoder class:

class MyEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, float):
            return format(obj, '.2f')
        return json.JSONEncoder.encode(self, obj)

This works for a sole float object:

>>> json.dumps(23.67, cls=MyEncoder)
'23.67'

But fails for nested objects:

>>> json.dumps([23.67, 23.97, 23.87], cls=MyEncoder)
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'
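The difference between the two calls can be reproduced with a small sketch. It assumes Python 3 and uses longer input values (my own choice) so the effect is visible there; encode() is called once with the top-level object, so the isinstance check never sees floats that sit inside a container.

```python
import json

class MyEncoder(json.JSONEncoder):
    def encode(self, obj):
        # encode() receives only the top-level object passed to dumps().
        if isinstance(obj, float):
            return format(obj, '.2f')
        return json.JSONEncoder.encode(self, obj)

top_level = json.dumps(23.671234, cls=MyEncoder)  # the float is the top-level object
nested = json.dumps([23.671234], cls=MyEncoder)   # the float is inside a list

print(top_level)  # 23.67
print(nested)     # [23.671234]
```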

I don't want to have external dependencies, so I prefer to stick with the standard json module.

How can I achieve this?

Python Solutions


Solution 1 - Python

Note: This does not work in any recent version of Python.

Unfortunately, I believe you have to do this by monkey-patching (which, in my opinion, indicates a design defect in the standard library json package). For example, this code:

import json
from json import encoder
encoder.FLOAT_REPR = lambda o: format(o, '.2f')
    
print(json.dumps(23.67))
print(json.dumps([23.67, 23.97, 23.87]))

emits:

23.67
[23.67, 23.97, 23.87]

as you desire. Obviously, there should be an architected way to override FLOAT_REPR so that EVERY representation of a float is under your control if you wish it to be; but unfortunately that's not how the json package was designed:-(.
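The note at the top of this answer can be checked with a short sketch. It assumes CPython 3, where, as far as I can tell, the encoder no longer consults the module-level FLOAT_REPR; the variable names are mine, and the patch is undone afterwards so it does not leak into other code.

```python
import json
from json import encoder

# Remember whatever is there so the patch can be undone afterwards.
original = getattr(encoder, 'FLOAT_REPR', None)
encoder.FLOAT_REPR = lambda o: format(o, '.2f')
try:
    patched_output = json.dumps([23.671234, 23.97])
finally:
    if original is None:
        del encoder.FLOAT_REPR
    else:
        encoder.FLOAT_REPR = original

# On CPython 3 the patch has no effect on the output.
print(patched_output)  # [23.671234, 23.97]
```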

Solution 2 - Python

import simplejson
    
class PrettyFloat(float):
    def __repr__(self):
        return '%.15g' % self
    
def pretty_floats(obj):
    if isinstance(obj, float):
        return PrettyFloat(obj)
    elif isinstance(obj, dict):
        return dict((k, pretty_floats(v)) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        return list(map(pretty_floats, obj))
    return obj
    
print(simplejson.dumps(pretty_floats([23.67, 23.97, 23.87])))

emits

[23.67, 23.97, 23.87]

No monkeypatching necessary.

Solution 3 - Python

It's really unfortunate that dumps doesn't allow you to do anything to floats. However, loads does. So if you don't mind the extra CPU load, you can run the data through the encoder, the decoder, and the encoder again to get the right result:

>>> json.dumps(json.loads(json.dumps([.333333333333, .432432]), parse_float=lambda x: round(float(x), 3)))
'[0.333, 0.432]'
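The same round trip can be wrapped in a small helper. This is only a sketch: the name dumps_rounded and its ndigits parameter are hypothetical, and the short output relies on Python 2.7+ or 3, where the repr of a rounded float is short.

```python
import json

def dumps_rounded(obj, ndigits=2, **kwargs):
    """Serialize obj, rounding every float to ndigits decimal places."""
    # First pass: ordinary serialization.
    text = json.dumps(obj, **kwargs)
    # Decode again, rounding each float literal as it is parsed.
    rounded = json.loads(text, parse_float=lambda s: round(float(s), ndigits))
    # Second pass: serialize the rounded data.
    return json.dumps(rounded, **kwargs)

print(dumps_rounded([.333333333333, .432432], 3))  # [0.333, 0.432]
print(dumps_rounded({"a": [1.23456, 2]}))          # {"a": [1.23, 2]}
```

The cost is two encoding passes and one decoding pass over the whole document.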

Solution 4 - Python

If you're using Python 2.7, a simple solution is to round your floats explicitly to the desired precision.

>>> import json, sys
>>> sys.version
'2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]'
>>> json.dumps(1.0/3.0)
'0.3333333333333333'
>>> json.dumps(round(1.0/3.0, 2))
'0.33'

This works because Python 2.7 made float rounding more consistent. Unfortunately this does not work in Python 2.6:

>>> sys.version
'2.6.6 (r266:84292, Dec 27 2010, 00:02:40) \n[GCC 4.4.5]'
>>> json.dumps(round(1.0/3.0, 2))
'0.33000000000000002'

The solutions mentioned above are workarounds for 2.6, but none is entirely adequate. Monkey-patching json.encoder.FLOAT_REPR does not work if your Python runtime uses a C version of the JSON module. The PrettyFloat class in Tom Wuttke's answer works, but only if %g encoding works globally for your application. The %.15g is a bit magic: it works because a float carries about 17 significant decimal digits, so cutting to 15 drops the representation noise in the last digits, and %g does not print trailing zeros.
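A quick way to see the %.15g effect, and why it may not suit every application, is to compare it with %.17g. This is a sketch with example values of my own:

```python
value = 23.67

seventeen = '%.17g' % value  # enough digits to expose the binary representation
fifteen = '%.15g' % value    # trailing zeros are dropped by %g

print(seventeen)  # 23.670000000000002
print(fifteen)    # 23.67

# 15 digits are not always enough to reproduce the exact float:
total = 0.1 + 0.2
print('%.15g' % total)                  # 0.3
print(float('%.15g' % total) == total)  # False
```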

I spent some time trying to make a PrettyFloat that allowed customization of precision for each number, i.e., a syntax like

>>> json.dumps(PrettyFloat(1.0 / 3.0, 4))
'0.3333'

It's not easy to get this right. Inheriting from float is awkward. Inheriting from object and using a JSONEncoder subclass with its own default() method should work, except the json module seems to assume all custom types should be serialized as strings; i.e., you end up with the JavaScript string "0.33" in the output, not the number 0.33. There may yet be a way to make this work, but it's harder than it looks.
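The string-versus-number behavior can be illustrated with a sketch. It assumes Python 2.7+ or 3, and FixedFloat and both encoder classes are hypothetical names. The value returned by default() is encoded again, so returning a formatted string yields a JSON string, while returning a rounded float yields a JSON number.

```python
import json

class FixedFloat(object):
    """Hypothetical wrapper pairing a value with its desired precision."""
    def __init__(self, value, digits):
        self.value = value
        self.digits = digits

class StringEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, FixedFloat):
            # Returns a str, so the output is a quoted JSON string.
            return format(o.value, '.%df' % o.digits)
        return json.JSONEncoder.default(self, o)

class NumberEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, FixedFloat):
            # Returns a float, so the output is a JSON number.
            return round(o.value, o.digits)
        return json.JSONEncoder.default(self, o)

as_string = json.dumps(FixedFloat(1.0 / 3.0, 4), cls=StringEncoder)
as_number = json.dumps(FixedFloat(1.0 / 3.0, 4), cls=NumberEncoder)

print(as_string)  # "0.3333"
print(as_number)  # 0.3333
```

Note that rounding drops trailing zeros (1.0 stays 1.0), so this variant cannot force a fixed number of decimals.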

Solution 5 - Python

Here's a solution that worked for me in Python 3 and does not require monkey patching:

import json

def round_floats(o):
    if isinstance(o, float): return round(o, 2)
    if isinstance(o, dict): return {k: round_floats(v) for k, v in o.items()}
    if isinstance(o, (list, tuple)): return [round_floats(x) for x in o]
    return o


json.dumps(round_floats([23.63437, 23.93437, 23.842347]))

Output is:

[23.63, 23.93, 23.84]

It copies the data but with rounded floats.

Solution 6 - Python

If you're stuck with Python 2.5 or earlier versions: The monkey-patch trick does not seem to work with the original simplejson module if the C speedups are installed:

$ python
Python 2.5.4 (r254:67916, Jan 20 2009, 11:06:13) 
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import simplejson
>>> simplejson.__version__
'2.0.9'
>>> simplejson._speedups
<module 'simplejson._speedups' from '/home/carlos/.python-eggs/simplejson-2.0.9-py2.5-linux-i686.egg-tmp/simplejson/_speedups.so'>
>>> simplejson.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'
>>> simplejson.encoder.c_make_encoder = None
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'
>>> 

Solution 7 - Python

You can do what you need to do, but it isn't documented:

>>> import json
>>> json.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

Solution 8 - Python

Using numpy

If you actually have really long floats you can round them up/down correctly with numpy:

import json
import numpy as np

data = np.array([23.671234, 23.97432, 23.870123])

json.dumps(np.around(data, decimals=2).tolist())
# '[23.67, 23.97, 23.87]'

Solution 9 - Python

I just released fjson, a small Python library to fix this issue. Install with

pip install fjson

and use just like json, with the addition of the float_format parameter:

import math
import fjson


data = {"a": 1, "b": math.pi}
print(fjson.dumps(data, float_format=".6e", indent=2))
{
  "a": 1,
  "b": 3.141593e+00
}

Solution 10 - Python

If you need to do this in Python 2.7 without overriding the global json.encoder.FLOAT_REPR, here's one way.

import json
import math

class MyEncoder(json.JSONEncoder):
    "JSON encoder that renders floats to two decimal places"

    FLOAT_FRMT = '{0:.2f}'

    def floatstr(self, obj):
        return self.FLOAT_FRMT.format(obj)

    def _iterencode(self, obj, markers=None):
        # stl JSON lame override #1
        new_obj = obj
        if isinstance(obj, float):
            if not math.isnan(obj) and not math.isinf(obj):
                new_obj = self.floatstr(obj)
        return super(MyEncoder, self)._iterencode(new_obj, markers=markers)

    def _iterencode_dict(self, dct, markers=None):
        # stl JSON lame override #2
        new_dct = {}
        for key, value in dct.iteritems():
            if isinstance(key, float):
                if not math.isnan(key) and not math.isinf(key):
                    key = self.floatstr(key)
            new_dct[key] = value
        return super(MyEncoder, self)._iterencode_dict(new_dct, markers=markers)

Then, in Python 2.7:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()
>>> enc.encode([23.67, 23.98, 23.87])
'[23.67, 23.98, 23.87]'

In Python 2.6, it doesn't quite work, as Matthew Schinckel points out:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()  
>>> enc.encode([23.67, 23.97, 23.87])
'["23.67", "23.97", "23.87"]'
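The private _iterencode hooks used above do not exist in Python 3. A rough equivalent that still avoids touching global state is to override iterencode() and round the data before it reaches the stock encoder. This is a sketch under that assumption; RoundingEncoder, round_floats, and NDIGITS are hypothetical names, and it rounds values rather than formatting them to a fixed width.

```python
import json

def round_floats(obj, ndigits):
    """Return a copy of obj with every float rounded to ndigits places."""
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {k: round_floats(v, ndigits) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [round_floats(v, ndigits) for v in obj]
    return obj

class RoundingEncoder(json.JSONEncoder):
    NDIGITS = 2  # hypothetical class-level setting

    def iterencode(self, o, _one_shot=False):
        # Both json.dumps() and json.dump() go through iterencode().
        return super().iterencode(round_floats(o, self.NDIGITS), _one_shot)

print(json.dumps([23.67123, {"x": 0.98765}], cls=RoundingEncoder))
# [23.67, {"x": 0.99}]
```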

Solution 11 - Python

Alex Martelli's solution will work for single-threaded apps, but may not work for multi-threaded apps that need to control the number of decimal places per thread. Here is a solution that should work in multi-threaded apps:

import threading
from json import encoder

def FLOAT_REPR(f):
    """
    Serialize a float to a string, with a given number of digits
    """
    decimal_places = getattr(encoder.thread_local, 'decimal_places', 0)
    format_str = '%%.%df' % decimal_places
    return format_str % f

encoder.thread_local = threading.local()
encoder.FLOAT_REPR = FLOAT_REPR     

#As an example, call like this:
import json

encoder.thread_local.decimal_places = 1
json.dumps([1.56, 1.54]) #Should result in '[1.6, 1.5]'

You can simply set encoder.thread_local.decimal_places to the number of decimal places you want, and the next call to json.dumps() in that thread will use that number of decimal places.
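Since patching FLOAT_REPR has no effect on recent Python versions, the per-thread idea can be kept by storing the precision in a threading.local and rounding the data before serialization. The following is a sketch only; set_decimal_places, dumps, and the _settings holder are hypothetical names.

```python
import json
import threading

_settings = threading.local()  # hypothetical holder for per-thread settings

def set_decimal_places(n):
    """Set the number of decimal places for the current thread only."""
    _settings.decimal_places = n

def _round_floats(obj, ndigits):
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {k: _round_floats(v, ndigits) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [_round_floats(v, ndigits) for v in obj]
    return obj

def dumps(obj, **kwargs):
    """Like json.dumps, but rounds floats if this thread set a precision."""
    ndigits = getattr(_settings, 'decimal_places', None)
    if ndigits is not None:
        obj = _round_floats(obj, ndigits)
    return json.dumps(obj, **kwargs)

set_decimal_places(1)
print(dumps([1.56, 1.54]))  # [1.6, 1.5]
```

A thread that never calls set_decimal_places() gets the unrounded output.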

Solution 12 - Python

When importing the standard json module, it is enough to change the default encoder FLOAT_REPR. There is no real need to import or create Encoder instances.

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.2f')

json.dumps([23.67, 23.97, 23.87]) #returns  '[23.67, 23.97, 23.87]'

Sometimes it is also very useful to output as JSON the best representation Python can guess with str. This makes sure significant digits are not ignored.

import json
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.670000000000002, 23.977900000000002, 23.874890000000001]'

json.encoder.FLOAT_REPR = str
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.67, 23.9779, 23.87489]'

Solution 13 - Python

I agree with @Nelson that inheriting from float is awkward, but perhaps a solution that only touches the __repr__ function might be forgivable. I ended up using the decimal package to reformat floats when needed. The upside is that this works in all contexts where repr() is called, so also when simply printing lists to stdout, for example. Also, the precision is configurable at runtime, after the data has been created. The downside is of course that your data needs to be converted to this special float class (as unfortunately you cannot seem to monkey-patch float.__repr__). For that I provide a brief conversion function.

The code:

import decimal
C = decimal.getcontext()

class decimal_formatted_float(float):
   def __repr__(self):
       s = str(C.create_decimal_from_float(self))
       if '.' in s: s = s.rstrip('0')
       return s

def convert_to_dff(elem):
    try:
        return elem.__class__(map(convert_to_dff, elem))
    except:
        if isinstance(elem, float):
            return decimal_formatted_float(elem)
        else:
            return elem

Usage example:

>>> import json
>>> li = [(1.2345,),(7.890123,4.567,890,890.)]
>>>
>>> decimal.getcontext().prec = 15
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.2345,), (7.890123, 4.567, 890, 890)]
>>> json.dumps(dff_li)
'[[1.2345], [7.890123, 4.567, 890, 890]]'
>>>
>>> decimal.getcontext().prec = 3
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.23,), (7.89, 4.57, 890, 890)]
>>> json.dumps(dff_li)
'[[1.23], [7.89, 4.57, 890, 890]]'
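The rounding in this answer comes from the decimal context. A small standalone sketch (the helper name shorten is hypothetical, and it uses a private Context instead of the global one) shows that prec counts significant digits, not decimal places, which matters for larger values:

```python
import decimal

def shorten(value, prec):
    """Format a float using prec significant digits via a private context."""
    ctx = decimal.Context(prec=prec)
    return str(ctx.create_decimal_from_float(value))

print(shorten(7.890123, 3))  # 7.89
print(shorten(4.567, 3))     # 4.57
print(shorten(890.0, 3))     # 890
# Values with more integer digits than prec switch to exponent notation:
print(shorten(1234.5, 3))    # 1.23E+3
```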

Solution 14 - Python

Pros:

  • Works with any JSON encoder, or even python's repr.
  • Short(ish), seems to work.

Cons:

  • Ugly regexp hack, barely tested.

  • Quadratic complexity.

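import re

def fix_floats(json, decimals=2, quote='"'):
    # Match a prefix made of complete quoted strings and non-quote characters,
    # followed by the first number with more than `decimals` decimal digits.
    pattern = r'^((?:(?:"(?:\\.|[^\\"])*?")|[^"])*?)(-?\d+\.\d{'+str(decimals)+r'}\d+)'
    pattern = re.sub('"', quote, pattern)
    fmt = "%%.%df" % decimals
    n = 1
    # Each pass shortens the first remaining long float; stop when none is left.
    while n:
        json, n = re.subn(pattern, lambda m: m.group(1)+(fmt % float(m.group(2))).rstrip('0'), json)
    return json
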

Solution 15 - Python

I did that :) Beware that with my code you will always have two digits after the decimal point:

>>> json_dumps_with_two_digit_float({'a': 1.0})
'{"a": 1.00}'

My custom function:

from unittest.mock import patch
import json

# Make sure the C encoder is not used, so the pure-Python encoder runs
@patch('json.encoder.c_make_encoder', None)
def json_dumps_with_two_digit_float(some_object):
    # Save the original function
    of = json.encoder._make_iterencode
    def inner(*args, **kwargs):
        args = list(args)
        # The fifth argument is the float formatter, which we replace
        args[4] = lambda o: '{:.2f}'.format(o)
        return of(*args, **kwargs)

    with patch('json.encoder._make_iterencode', wraps=inner):
        return json.dumps(some_object)

Don't forget to create some tests in your project, because my function relies heavily on the implementation of the Python json module, which can change in the future.

Solution 16 - Python

I am amazed / bemused that this is not a feature. Fortunately, the TensorFlow authors have already solved this problem by using a regex:

import json
import re

def FormatFloat(json_str, float_digits):
  pattern = re.compile(r'\d+\.\d+')
  float_repr = '{:.' + '{}'.format(float_digits) + 'f}'

  def MRound(match):
    return float_repr.format(float(match.group()))

  return re.sub(pattern, MRound, json_str)

def Dumps(obj, float_digits=-1, **params):
  """Wrapper of json.dumps that allows specifying the float precision used.

  Args:
    obj: The object to dump.
    float_digits: The number of digits of precision when writing floats out.
    **params: Additional parameters to pass to json.dumps.

  Returns:
    output: JSON string representation of obj.
  """
  json_str = json.dumps(obj, **params)

  if float_digits > -1:
    json_str = FormatFloat(json_str, float_digits)

  return json_str

This works by wrapping json.dumps from the standard package and then running a regex on the result.
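A condensed sketch of the same post-processing idea shows one thing to watch for: the pattern matches any digits.digits run in the serialized text, so numbers inside string values are rewritten too. The helper name format_floats is hypothetical.

```python
import json
import re

_FLOAT = re.compile(r'\d+\.\d+')  # same pattern as in the answer above

def format_floats(json_str, digits):
    """Rewrite every digits.digits run in json_str with a fixed precision."""
    return _FLOAT.sub(
        lambda m: '{:.{}f}'.format(float(m.group()), digits), json_str)

print(format_floats(json.dumps([23.671234, 1.0]), 2))  # [23.67, 1.00]

# Caveat: text inside JSON strings is not protected from the regex.
print(format_floats(json.dumps({"v": "1.2345"}), 2))   # {"v": "1.23"}
```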

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author
Question | Manuel Ceron
Solution 1 - Python | Alex Martelli
Solution 2 - Python | Tom Wuttke
Solution 3 - Python | Claude
Solution 4 - Python | Nelson
Solution 5 - Python | jcoffland
Solution 6 - Python | Carlos Valiente
Solution 7 - Python | Ned Batchelder
Solution 8 - Python | Mikhail
Solution 9 - Python | Nico Schlömer
Solution 10 - Python | Mike Fogel
Solution 11 - Python | Anton I. Sipos
Solution 12 - Python | F Pereira
Solution 13 - Python | user1556435
Solution 14 - Python | Sam Watkins
Solution 15 - Python | Illia Sukonnik
Solution 16 - Python | Matt