How to create a checkpoint of a model in TensorFlow

This recipe helps you create a checkpoint of a model in TensorFlow.

Recipe Objective

How do we create a checkpoint of a model in TensorFlow?

To do this, we first define a model, then set a path for the checkpoint and train the model with a checkpoint callback. Afterwards, we can inspect the result in the checkpoint directory: running "ls {directory_checkpoint}" lists the checkpoint files that were created.


Step 1 - Import library

import os
import tensorflow as tf
from tensorflow import keras

Step 2 - Load the Data

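# Load the MNIST digits as (train, test) pairs of images and labels
(images_data_train, images_train_labels), (images_data_test, images_test_labels) = tf.keras.datasets.mnist.load_data()
# Keep only the first 1000 labels of each split
images_train_labels = images_train_labels[:1000]
images_test_labels = images_test_labels[:1000]
# Keep the first 1000 images, flatten each 28x28 image to 784 values, and scale pixels to [0, 1]
images_data_train = images_data_train[:1000].reshape(-1, 28 * 28) / 255.0
images_data_test = images_data_test[:1000].reshape(-1, 28 * 28) / 255.0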
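To see what the reshape and scaling in this step do without downloading MNIST, here is a minimal sketch on synthetic data. The array `fake_images` is a hypothetical stand-in with the same shape as the first 1000 MNIST images; it is not part of the recipe.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 1000 grayscale images, 28x28 pixels, values 0..255.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-element vector and scale pixels to [0, 1].
flat = fake_images.reshape(-1, 28 * 28) / 255.0

print(flat.shape)  # (1000, 784)
```

The -1 in reshape lets NumPy infer the number of rows, so the same line works for any number of images.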

Step 3 - Define the model

# Define a simple sequential model
def Make_model():
    My_model = tf.keras.models.Sequential([
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    My_model.compile(optimizer='adam',
                     loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                     metrics=[tf.metrics.SparseCategoricalAccuracy()])
    return My_model

# Create a basic model instance
My_model = Make_model()

# Display the model's architecture
My_model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_8 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_4 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_9 (Dense)              (None, 10)                5130      
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
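The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit, and Dropout has no parameters. A small sketch of that arithmetic follows; the helper name `dense_params` is introduced here for illustration only.

```python
def dense_params(inputs: int, units: int) -> int:
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


hidden = dense_params(784, 512)  # first Dense layer: 784 inputs, 512 units
output = dense_params(512, 10)   # final Dense layer: 512 inputs, 10 units
dropout = 0                      # Dropout has no trainable parameters

print(hidden, output, hidden + dropout + output)  # 401920 5130 407050
```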

Step 4 - Save the Checkpoints

path_checkpoint = "training_1/cp.ckpt"
directory_checkpoint = os.path.dirname(path_checkpoint)

# Create a callback that saves only the model's weights
callback = tf.keras.callbacks.ModelCheckpoint(filepath=path_checkpoint,
                                              save_weights_only=True,
                                              verbose=1)

# Train the model with the checkpoint callback
My_model.fit(images_data_train,
             images_train_labels,
             epochs=10,
             validation_data=(images_data_test, images_test_labels),
             callbacks=[callback])

Epoch 1/10
32/32 [==============================] - 1s 12ms/step - loss: 1.5851 - sparse_categorical_accuracy: 0.5323 - val_loss: 0.6799 - val_sparse_categorical_accuracy: 0.8050

Epoch 00001: saving model to training_1/cp.ckpt
Epoch 2/10
32/32 [==============================] - 0s 7ms/step - loss: 0.4546 - sparse_categorical_accuracy: 0.8551 - val_loss: 0.5102 - val_sparse_categorical_accuracy: 0.8480

Epoch 00002: saving model to training_1/cp.ckpt
Epoch 3/10
32/32 [==============================] - ETA: 0s - loss: 0.2908 - sparse_categorical_accuracy: 0.9217
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
32/32 [==============================] - 1s 17ms/step - loss: 0.2907 - sparse_categorical_accuracy: 0.9217 - val_loss: 0.4689 - val_sparse_categorical_accuracy: 0.8560

Epoch 00003: saving model to training_1/cp.ckpt
Epoch 4/10
32/32 [==============================] - 0s 7ms/step - loss: 0.1676 - sparse_categorical_accuracy: 0.9643 - val_loss: 0.4308 - val_sparse_categorical_accuracy: 0.8610

Epoch 00004: saving model to training_1/cp.ckpt
Epoch 5/10
32/32 [==============================] - 0s 8ms/step - loss: 0.1548 - sparse_categorical_accuracy: 0.9681 - val_loss: 0.4265 - val_sparse_categorical_accuracy: 0.8590

Epoch 00005: saving model to training_1/cp.ckpt
Epoch 6/10
32/32 [==============================] - 0s 7ms/step - loss: 0.1380 - sparse_categorical_accuracy: 0.9767 - val_loss: 0.4116 - val_sparse_categorical_accuracy: 0.8620

Epoch 00006: saving model to training_1/cp.ckpt
Epoch 7/10
32/32 [==============================] - 0s 7ms/step - loss: 0.0871 - sparse_categorical_accuracy: 0.9902 - val_loss: 0.3967 - val_sparse_categorical_accuracy: 0.8690

Epoch 00007: saving model to training_1/cp.ckpt
Epoch 8/10
32/32 [==============================] - 0s 7ms/step - loss: 0.0598 - sparse_categorical_accuracy: 0.9938 - val_loss: 0.3946 - val_sparse_categorical_accuracy: 0.8750

Epoch 00008: saving model to training_1/cp.ckpt
Epoch 9/10
32/32 [==============================] - 0s 7ms/step - loss: 0.0431 - sparse_categorical_accuracy: 0.9995 - val_loss: 0.3989 - val_sparse_categorical_accuracy: 0.8730

Epoch 00009: saving model to training_1/cp.ckpt
Epoch 10/10
32/32 [==============================] - 0s 8ms/step - loss: 0.0378 - sparse_categorical_accuracy: 1.0000 - val_loss: 0.4008 - val_sparse_categorical_accuracy: 0.8720

Epoch 00010: saving model to training_1/cp.ckpt

Here we define a path for saving checkpoints during training, then create a callback that saves the model's weights. We then train the model with this callback. Running this may generate warnings related to saving the state of the optimizer; these warnings are in place to discourage outdated usage and can be ignored.
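A checkpoint is only useful if the weights can be loaded back, so here is a minimal sketch of that round trip. It is not the recipe's own code: `make_small_model` is a hypothetical, smaller stand-in for the model from Step 3, random arrays replace MNIST so it runs offline, and the file name `cp.weights.h5` is an assumption. Recent Keras releases appear to require the `.weights.h5` suffix when `save_weights_only=True`, whereas the `cp.ckpt` name used in this recipe belongs to the older TensorFlow checkpoint format.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def make_small_model():
    # Hypothetical, smaller stand-in for the model defined in Step 3.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    return model


# Random data in place of MNIST so the sketch runs without a download.
x = np.random.rand(64, 784).astype("float32")
y = np.random.randint(0, 10, size=(64,))

# Assumed file name; the suffix is chosen to suit recent Keras versions.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "cp.weights.h5")
callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path, save_weights_only=True
)

trained = make_small_model()
trained.fit(x, y, epochs=2, verbose=0, callbacks=[callback])

# A fresh model starts with different random weights until the checkpoint is loaded.
restored = make_small_model()
restored.load_weights(checkpoint_path)

same = all(
    np.allclose(a, b)
    for a, b in zip(trained.get_weights(), restored.get_weights())
)
print("weights match:", same)
```

Because only the weights are saved, the restored model must be built with the same architecture as the one that wrote the checkpoint.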

Step 5 - Check the Saved checkpoint

ls {directory_checkpoint}

checkpoint  cp.ckpt.data-00000-of-00001  cp.ckpt.index
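The `ls {directory_checkpoint}` form relies on notebook variable interpolation. In a plain Python script, a rough equivalent is to list the directory with the standard library. The helper name `list_checkpoint_files` and the empty placeholder files below are assumptions for illustration, named after the files in the listing above.

```python
import os
import tempfile


def list_checkpoint_files(directory):
    """Return the sorted file names found in a checkpoint directory."""
    return sorted(os.listdir(directory))


# Placeholder directory with empty files, so the sketch runs
# without training a model first.
demo_dir = tempfile.mkdtemp()
for name in ("checkpoint", "cp.ckpt.data-00000-of-00001", "cp.ckpt.index"):
    open(os.path.join(demo_dir, name), "w").close()

print(list_checkpoint_files(demo_dir))
# ['checkpoint', 'cp.ckpt.data-00000-of-00001', 'cp.ckpt.index']
```

In the recipe itself, the same call would be made as `list_checkpoint_files(directory_checkpoint)` after training.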
