TensorFlow: why are there 3 files after saving the model?

Tensorflow

Tensorflow Problem Overview


Having read the docs, I saved a model in TensorFlow; here is my demo code:

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
 
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
 
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ..
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)

but after that, I found there are 3 files

model.ckpt.data-00000-of-00001
model.ckpt.index
model.ckpt.meta

And I can't restore the model by restoring model.ckpt, since there is no file with that name. Here is my code:

with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")

So, why are there 3 files?

Tensorflow Solutions


Solution 1 - Tensorflow

Try this:

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('/tmp/model.ckpt.meta')
    saver.restore(sess, "/tmp/model.ckpt")

The TensorFlow save method saves three kinds of files because it stores the graph structure separately from the variable values. The .meta file describes the saved graph structure, so you need to import it before restoring the checkpoint (otherwise it doesn't know what variables the saved checkpoint values correspond to).

Alternatively, you could do this:

# Recreate the EXACT SAME variables
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")

...

# Now load the checkpoint variable values
with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, "/tmp/model.ckpt")

Even though there is no file named model.ckpt, you still refer to the saved checkpoint by that name when restoring it. From the saver.py source code:

> Users only need to interact with the user-specified prefix... instead of any physical pathname.
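To see what "prefix instead of physical pathname" means in practice, here is a stdlib-only sketch. The temporary directory and empty files are stand-ins for a real checkpoint, so no TensorFlow is needed:

```python
import glob
import os
import tempfile

# Simulate the three files that saver.save(sess, prefix) writes.
prefix = os.path.join(tempfile.mkdtemp(), "model.ckpt")
for suffix in (".data-00000-of-00001", ".index", ".meta"):
    open(prefix + suffix, "w").close()

# There is no file named exactly "model.ckpt"...
print(os.path.exists(prefix))  # False

# ...but the prefix matches all three physical files,
# which is why saver.restore(sess, prefix) still works.
print(sorted(os.path.basename(p) for p in glob.glob(prefix + ".*")))
# ['model.ckpt.data-00000-of-00001', 'model.ckpt.index', 'model.ckpt.meta']
```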

Solution 2 - Tensorflow

  • meta file: describes the saved graph structure and includes GraphDef, SaverDef, and so on; applying tf.train.import_meta_graph('/tmp/model.ckpt.meta') restores both the Saver and the Graph.

  • index file: an immutable string-to-string table (tensorflow::table::Table). Each key is the name of a tensor and its value is a serialized BundleEntryProto. Each BundleEntryProto describes the metadata of a tensor: which of the "data" files contains the content of the tensor, the offset into that file, a checksum, some auxiliary data, etc.

  • data file: a TensorBundle collection that stores the values of all variables.
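The index/data split above can be pictured with a plain-Python analogy. This is a deliberate simplification, not TensorFlow's actual BundleEntryProto encoding: all variable values go into one byte blob (the "data" file), and a small table maps each tensor name to the file, offset, and length of its slice (the "index" file):

```python
import struct

# Toy "data" file: all variable values concatenated into one byte blob.
values = {"v1": [1.0, 2.0], "v2": [3.0]}

data = b""
index = {}  # toy "index": tensor name -> (data file, offset, length)
for name, floats in values.items():
    blob = struct.pack("%df" % len(floats), *floats)
    index[name] = ("model.ckpt.data-00000-of-00001", len(data), len(blob))
    data += blob

# Restoring "v2" means: look up its entry in the index,
# then read exactly that slice of the data file.
fname, offset, length = index["v2"]
restored = struct.unpack("%df" % (length // 4), data[offset:offset + length])
print(restored)  # (3.0,)
```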

Solution 3 - Tensorflow

I am restoring trained word embeddings from the [Word2Vec][1] TensorFlow tutorial.

In case you have created multiple checkpoints, the files created look like this:

> model.ckpt-55695.data-00000-of-00001
>
> model.ckpt-55695.index
>
> model.ckpt-55695.meta

try this

def restore_session(self, session):
    saver = tf.train.import_meta_graph('./tmp/model.ckpt-55695.meta')
    saver.restore(session, './tmp/model.ckpt-55695')

when calling restore_session():

def test_word2vec():
    opts = Options()
    with tf.Graph().as_default(), tf.Session() as session:
        with tf.device("/cpu:0"):
            model = Word2Vec(opts, session)
            model.restore_session(session)
            model.get_embedding("assistance")

[1]: https://github.com/tensorflow/models/blob/master/tutorials/embedding/word2vec.py "Word2Vec"
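Rather than hard-coding the step number, you can also ask TensorFlow for the newest prefix with tf.train.latest_checkpoint('./tmp'): alongside the numbered files, the Saver maintains a small plain-text file named "checkpoint" whose first line names the most recent prefix. The sketch below is a simplified stdlib-only illustration of that lookup (the file contents mirror what the Saver writes; use the real API in practice):

```python
import os
import tempfile

# The Saver keeps a plain-text "checkpoint" file next to the numbered
# files; its first line records the most recent checkpoint prefix.
ckpt_dir = tempfile.mkdtemp()
with open(os.path.join(ckpt_dir, "checkpoint"), "w") as f:
    f.write('model_checkpoint_path: "model.ckpt-55695"\n')
    f.write('all_model_checkpoint_paths: "model.ckpt-55000"\n')
    f.write('all_model_checkpoint_paths: "model.ckpt-55695"\n')

def latest_checkpoint(directory):
    """Simplified stand-in for tf.train.latest_checkpoint."""
    with open(os.path.join(directory, "checkpoint")) as f:
        first_line = f.readline()
    # first_line looks like: model_checkpoint_path: "model.ckpt-55695"
    prefix = first_line.split('"')[1]
    return os.path.join(directory, prefix)

print(os.path.basename(latest_checkpoint(ckpt_dir)))  # model.ckpt-55695
```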

Solution 4 - Tensorflow

If you trained a CNN with dropout, for example, you could do this:

import numpy as np
import tensorflow as tf

def predict(image, model_name):
    """
    image -> single image, shape (width, height, channels)
    model_name -> checkpoint prefix that was saved, without any extension
    """
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('./' + model_name + '.meta')
        saver.restore(sess, './' + model_name)
        # Substitute 'logits' with your model's output tensor
        prediction = tf.argmax(logits, 1)
        # 'x' is your input placeholder. Here it expects a batch of RGB
        # images, so a batch dimension is added; feeding keep_prob_dnn = 1.0
        # disables dropout at inference time.
        return prediction.eval(feed_dict={x: image[np.newaxis, :, :, :],
                                          keep_prob_dnn: 1.0})

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | GoingMyWay | View Question on Stackoverflow |
| Solution 1 - Tensorflow | T.K. Bartel | View Answer on Stackoverflow |
| Solution 2 - Tensorflow | Guangcong Liu | View Answer on Stackoverflow |
| Solution 3 - Tensorflow | Steven Wong | View Answer on Stackoverflow |
| Solution 4 - Tensorflow | happy_sisyphus | View Answer on Stackoverflow |