TensorFlow's tf.data API is a powerful tool that significantly simplifies the task of constructing an input data pipeline. When integrating a dataset into a model, the pipeline streamlines the process, letting us feed data into the model for training with seamless compatibility. It also handles batching and shuffling with just a single line of code, improving the efficiency of data preparation.
Moreover, it lets us transform the dataset with ease, all without incurring excessive memory overhead. The dataset object is optimized for performance, making it a valuable asset in a machine learning workflow.
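To make the batching and shuffling claim concrete, here is a minimal sketch (the data values are illustrative, not from any real dataset): `shuffle` randomizes element order within a buffer, `batch` groups consecutive elements, and `map` transforms each element lazily.

```python
import tensorflow as tf

inputs = [0, 1, 2, 4]
labels = [0, 1, 0, 1]

dataset = tf.data.Dataset.from_tensor_slices((inputs, labels))
# One line each: shuffle with a buffer covering the whole dataset,
# then group elements into batches of 2.
dataset = dataset.shuffle(buffer_size=4).batch(2)
# Transform every (x, y) pair lazily, without materializing a copy in memory.
dataset = dataset.map(lambda x, y: (x * 2, y))
for x, y in dataset:
    print(x.numpy(), y.numpy())
```

Each iteration yields a batch of 2 inputs and 2 labels; the order varies between runs because of the shuffle.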
This tutorial is structured as a series of questions and answers: look at each snippet and guess what it prints.
import numpy as np
import tensorflow as tf
import pandas as pd
inputs = [0, 1, 2, 4]
labels = [0, 1, 0, 1]
A dataset is created using the tf.data.Dataset.from_tensor_slices function.
- Guess the result for a Python list
dataset = tf.data.Dataset.from_tensor_slices((inputs, labels))
for item in dataset:
    print(item)
- Guess the result for a NumPy array
inputs = np.array(inputs)
labels = np.array(labels)
dataset = tf.data.Dataset.from_tensor_slices((inputs, labels))
for item in dataset:
    print(item)
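If you want to check the structure of your guess without printing every element, `element_spec` describes one element of the dataset; this is a small sketch using the same arrays as above.

```python
import numpy as np
import tensorflow as tf

inputs = np.array([0, 1, 2, 4])
labels = np.array([0, 1, 0, 1])
dataset = tf.data.Dataset.from_tensor_slices((inputs, labels))
# element_spec shows the shape and dtype of a single element:
# here, a pair of scalar int64 tensors (one input, one label).
print(dataset.element_spec)
```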
- Guess the result for a pandas DataFrame
df = pd.DataFrame({
    'inputs': [0, 1, 2, 3],
    'labels': [1, 0, 1, 0]
})
df['labels'].values  # the underlying NumPy array of the column
dataset = tf.data.Dataset.from_tensor_slices((df['inputs'].values, df['labels'].values))
for item in dataset.take(2):
    print(item)
- Guess the result for a CSV file
dataset = tf.data.experimental.make_csv_dataset(
    "train.csv",
    batch_size=16,
    field_delim=",",
    select_columns=["ID", "AGE", "Location"],  # the label column must be selected too
    label_name="Location"
)
for item in dataset.skip(2):
    print(item)
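Since the snippet above depends on a train.csv file you may not have, here is a self-contained sketch: it writes a tiny hypothetical CSV (the file name, columns, and values are made up for illustration) and reads it back with make_csv_dataset. Note that num_epochs defaults to None, which repeats the file forever, so we set it to 1.

```python
import tensorflow as tf

# Hypothetical file and columns, mirroring the snippet above.
csv_path = "example_train.csv"
with open(csv_path, "w") as f:
    f.write("ID,AGE,Location\n")
    f.write("1,25,0\n2,31,1\n3,47,0\n4,19,1\n")

dataset = tf.data.experimental.make_csv_dataset(
    csv_path,
    batch_size=2,
    num_epochs=1,    # read the file once instead of looping forever
    shuffle=False,   # keep row order so the output is predictable
    label_name="Location"
)
# Each element is a (features, label) pair, where features is an
# OrderedDict mapping column names to batched tensors.
for features, label in dataset.take(1):
    print({k: v.numpy() for k, v in features.items()}, label.numpy())
```

Unlike from_tensor_slices, make_csv_dataset already batches for you, so each element holds batch_size rows.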
You can post your answers in the comment section.