Tushar Srivastava

Posted on • Originally published at tusharsrivastava.hashnode.dev

Handling text files in python - an easy guide for beginners

When working on a large-scale web application or any project that involves a large amount of data, it is not practical to keep all of that data in variables: variables live in memory, so their contents are lost when the program ends. We need something more reliable and structured. This is where data files come into play. They give us a persistent way to store, access, and manipulate data.

In Python, there are two types of data files:

  1. Text files
  2. Binary files

Text files are regular data files that we all are familiar with. We can open these files in a text editor and read the content inside.

Binary files, on the other hand, store data as raw bytes in a format meant to be interpreted by a program rather than read as text, for example images, audio, or executables. Most of the files on our computers are stored in binary format.

In this article, I will cover the basic syntax for opening and closing files, along with the other tools Python provides to handle text files efficiently.

Opening a file

The most commonly used function for handling data files in Python is open(). It opens a file in one of the following modes -

  • r (read mode) - to read the contents of a file
  • w (write mode) - to write to a file. Note that this mode overwrites the previously stored data.
  • a (append mode) - to append to an existing file. This mode writes data at the end of the file and no previously stored data is lost.
  • x (create mode) - to create a new file and open it for writing. This mode raises a FileExistsError if the file already exists.
  • r+ / +r (read and write mode) - to both read and write data to the same file. The file must already exist, and its existing content is kept.
  • a+ / +a (read and append mode) - to both read and append data to the same file.
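
The combined modes are easiest to understand by watching them work. Here is a minimal sketch, not a definitive recipe: the scratch file name modes_demo.txt and the temporary directory are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory (removed afterwards).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")

    with open(path, "w") as f:      # create the file with one line
        f.write("first\n")

    with open(path, "a+") as f:     # a+ starts at the end of the file
        f.write("second\n")         # so this text is appended
        f.seek(0)                   # move back to the start before reading
        after_append = f.read()

    with open(path, "r+") as f:     # r+ keeps the existing data
        f.write("FIRST")            # overwrites the first five characters in place
        f.seek(0)
        after_update = f.read()

print(repr(after_append))
print(repr(after_update))
```

Note that 'a+' needs the seek(0) call before reading, because the cursor starts at the end of the file.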

Syntax for opening a file

fileObject = open(filename, mode)

If you don't specify a mode, Python opens the file in 'r' mode as default.

So, f = open("file.txt") is the same as f = open("file.txt", 'r')

Here, 'f' is the file object. It does not hold the file's contents itself; it gives us methods for reading from and writing to the opened file.

Opening files using 'with' clause

Another way of opening files in Python is by using the 'with' clause, which is generally considered the safer and more idiomatic way to open files.

The advantage of using the 'with' clause is that the file is closed automatically when the block ends, even if an error occurs inside it, so you cannot forget to close it.

Syntax

with open(filename, mode) as fileObject:
    ...  # work with fileObject inside this indented block

Example

with open("file.txt", 'r') as myFile:
    for text in myFile:
        # each line already ends with '\n', so stop print() from adding another
        print(text, end="")
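
To see the automatic closing in action, here is a small sketch that deliberately raises an error inside the 'with' block. The file name with_demo.txt and the simulated ValueError are assumptions for the demo only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with_demo.txt")   # hypothetical file name

    try:
        with open(path, "w") as my_file:
            my_file.write("partial data")
            raise ValueError("simulated failure")   # leave the block abnormally
    except ValueError:
        pass

    closed_after_error = my_file.closed   # the file was closed despite the error

    with open(path) as my_file:
        saved_text = my_file.read()       # the text was flushed when the file closed

print(closed_after_error, repr(saved_text))
```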

File Object Attributes

File objects in Python have a few attributes that give us more information about the opened file -

  • file.closed - True if the file is closed and False otherwise
  • file.name - the name of the opened file
  • file.mode - the mode in which the file was opened
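
A quick way to get familiar with these attributes is to inspect them inside and outside a 'with' block. This is only a sketch; attrs_demo.txt is a made-up file name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "attrs_demo.txt")   # hypothetical file name

    with open(path, "w") as my_file:
        name_inside = my_file.name        # the path that was passed to open()
        mode_inside = my_file.mode        # the mode string, here 'w'
        closed_inside = my_file.closed    # still open inside the block

    closed_outside = my_file.closed       # closed once the block has ended

print(mode_inside, closed_inside, closed_outside)
```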

Reading a file

To read a file in Python, we first need to open the file in r, r+, or a+ mode.

with open("file.txt", 'r') as myFile:
    pass  # more code goes here...

There are three ways to read the contents of a file -

1. The read() method

This method reads a specific number of characters from a text file (or bytes, if the file is opened in binary mode).

Syntax

fileObject.read(n)    # 'n' is the number of characters to read

If 'n' is not specified or is negative, it reads everything from the current position to the end of the file.

Let's understand this method with an example -

# reading 8 characters from the file
with open("file.txt", 'r') as myFile:
    print(myFile.read(8))

Output:
Hello wo

# reading all the content from the file
with open("file.txt", 'r') as myFile:
    print(myFile.read())

Output:
Hello world! This is content of the file
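
Successive read() calls on the same file object continue from where the previous call stopped. The sketch below shows this on a scratch file; the file name read_demo.txt is an assumption for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "read_demo.txt")   # hypothetical file name
    with open(path, "w") as f:
        f.write("Hello world! This is content of the file")

    with open(path, "r") as f:
        first = f.read(8)     # the first 8 characters
        second = f.read(4)    # the next 4 characters
        rest = f.read()       # everything that is left
        at_end = f.read()     # nothing left to read: an empty string

print(repr(first), repr(second), repr(rest), repr(at_end))
```

An empty string from read() is how Python signals that the end of the file has been reached.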

2. The readline() method

This method reads a single line from the file, starting at the current position. If a number is passed, it reads at most that many characters, but never goes past the end of the current line.

Each line ends with a newline character '\n', which is counted as a single character.

Syntax

fileObject.readline(n)    # 'n' is the maximum number of characters to read

If 'n' is not specified or is negative, it reads up to the end of the current line. Right after opening a file, that is the entire first line.

Example

# reading 10 characters from the first line
with open("file.txt", 'r') as myFile:
    print(myFile.readline(10))

Output:
Hello worl

# reading the entire first line
with open("file.txt", 'r') as myFile:
    # the line already ends with '\n', so stop print() from adding another
    print(myFile.readline(), end="")

Output:
Hello world! This is the first line of the file
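
Calling readline() repeatedly walks through the file one line at a time, and it returns an empty string once the end is reached. A minimal sketch, assuming a three-line scratch file named readline_demo.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "readline_demo.txt")   # hypothetical file name
    with open(path, "w") as f:
        f.write("line one\nline two\nline three")

    lines_seen = []
    with open(path, "r") as f:
        first_part = f.readline(4)     # at most 4 characters of the first line
        rest_of_line = f.readline()    # the remainder of that same line
        while True:
            line = f.readline()
            if line == "":             # empty string means end of file
                break
            lines_seen.append(line)

print(repr(first_part), repr(rest_of_line), lines_seen)
```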

3. The readlines() method

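This method reads all the remaining lines of a text file and returns them as the items of a list. It is normally called without arguments, although it accepts an optional size hint that limits how much is read.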

Syntax

fileObject.readlines()

Example

with open("file.txt", 'r') as myFile:
    data = myFile.readlines()
    print(data)

Output:
['Hello world!\n', 'Hello world!\n', 'Hello world!\n', 'Hello world!\n']

As we can see, each line in the file is returned as a member of the list with a newline character '\n' at the end.

If we want each line without the trailing newline character, we can call the string method splitlines() on it. It returns a list, so each line below is printed as its own one-item list.

with open("file.txt", 'r') as myFile:
    lines = myFile.readlines()
    for line in lines:
        line_split = line.splitlines()
        print(line_split)

Output:
['Hello world!']
['Hello world!']
['Hello world!']
['Hello world!']
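
If the goal is a single flat list of lines without the '\n' characters, two common alternatives are sketched below. The file name lines_demo.txt is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lines_demo.txt")   # hypothetical file name
    with open(path, "w") as f:
        f.write("Hello world!\n" * 4)

    # Option 1: read everything, then split on line boundaries.
    with open(path, "r") as f:
        stripped = f.read().splitlines()

    # Option 2: loop over the file and strip the newline from each line.
    with open(path, "r") as f:
        stripped_loop = [line.rstrip("\n") for line in f]

print(stripped)
```

Both options give the same list here; the second one avoids holding the whole raw text in memory at once.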

Creating a file

To create a file in Python, we use the open() function and pass the name and mode for the file as arguments.

Syntax

fileObject = open(filename, mode)

When a file is opened in write(w) mode, an empty file is created. If a file with the same name already exists in the system, all the previous data is erased and a new empty file is created.

When opened in append(a) mode, the previous data of the file remains and the new data is written at the end. However, if the file does not exist already, an empty file is created.

Create (x) mode creates a new file with the specified name and opens it for writing. If a file with the same name already exists, it raises a FileExistsError instead of overwriting it.
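
Create mode is handy when you want to be sure you never overwrite an existing file. Here is one possible sketch; the helper name create_new_file and the file name are hypothetical, not part of Python.

```python
import os
import tempfile

def create_new_file(path, text):
    """Create path containing text; return False if the file already exists."""
    try:
        with open(path, "x") as f:    # 'x' refuses to open an existing file
            f.write(text)
    except FileExistsError:
        return False
    return True

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "create_demo.txt")   # hypothetical file name
    first_try = create_new_file(path, "original")
    second_try = create_new_file(path, "replacement")  # file exists by now
    with open(path, "r") as f:
        final_content = f.read()

print(first_try, second_try, repr(final_content))
```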

Writing to a file

For writing to a file, we usually open it in either 'write' or 'append' mode.

Let's understand the difference between the two -

Write (w) mode opens the file, or creates it if it doesn't exist already. If the file does exist, its content is erased as soon as it is opened, and the offset is set at the beginning of the now-empty file, so the data written after opening replaces whatever was stored before.

Append(a) mode, on the other hand, sets the offset of the file at its end after opening, which means that the new data is written to the file after the previous data, instead of overwriting it.
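
The difference is easy to verify on a scratch file. The following sketch assumes a made-up file name, write_vs_append.txt:

```python
import os
import tempfile

def read_all(path):
    """Small helper for this demo: return the whole content of a text file."""
    with open(path, "r") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "write_vs_append.txt")   # hypothetical file name

    with open(path, "w") as f:
        f.write("old data\n")

    with open(path, "a") as f:        # append: old data is kept
        f.write("appended\n")
    after_append = read_all(path)

    with open(path, "w") as f:        # write: old data is erased on open
        f.write("new data\n")
    after_write = read_all(path)

print(repr(after_append))
print(repr(after_write))
```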

After opening the file in either of these modes, there are two methods for writing data to the file -

1. The write() method

This method takes a string as an argument and returns the number of characters written to the file.

Numerical values need to be converted into strings before being passed as the argument.

Syntax

fileObject.write("This is some data")

Example

>>> myFile = open("file.txt", 'w')
>>> myFile.write("Hello World!")
12
>>> myFile.close()
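
Here is a short sketch of writing numbers to a text file. The readings list is invented sample data and numbers_demo.txt is a hypothetical file name; the point is only to show the conversion to strings.

```python
import os
import tempfile

readings = [21, 19.5, 23]   # made-up sample values

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "numbers_demo.txt")   # hypothetical file name

    with open(path, "w") as f:
        try:
            f.write(readings[0])          # passing a number directly fails
            rejected_int = False
        except TypeError:
            rejected_int = True

        # Convert explicitly with str(), or let an f-string do it.
        chars_written = f.write(str(readings[0]) + "\n")
        for value in readings[1:]:
            f.write(f"{value}\n")

    with open(path, "r") as f:
        content = f.read()

print(rejected_int, chars_written, repr(content))
```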

2. The writelines() method

This method is used to write multiple lines to a file at the same time. It takes an iterable object like a tuple or a list, containing multiple lines, as the argument.

Syntax

fileObject.writelines(object)

Look at this example for a better understanding -

>>> myFile = open("file.txt", 'w')
>>> lines = ["line1\n", "line2\n", "line3\n"]
>>> myFile.writelines(lines)
>>> myFile.close()

Remember to put the newline character (\n) at the end of each line, because writelines() does not add it for you.

After running this code, the file will look like this -

line1
line2
line3
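
When the items in your list do not already end with '\n', one option is to add it while writing. This is a sketch under that assumption; the names list and file names are made up.

```python
import os
import tempfile

names = ["line1", "line2", "line3"]   # items without trailing newlines

with tempfile.TemporaryDirectory() as tmp_dir:
    joined_path = os.path.join(tmp_dir, "joined.txt")       # hypothetical names
    separate_path = os.path.join(tmp_dir, "separate.txt")

    with open(joined_path, "w") as f:
        f.writelines(names)           # no newlines are added automatically

    with open(separate_path, "w") as f:
        # writelines() accepts any iterable, including a generator expression
        f.writelines(name + "\n" for name in names)

    with open(joined_path, "r") as f:
        joined = f.read()
    with open(separate_path, "r") as f:
        separate = f.read()

print(repr(joined))
print(repr(separate))
```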

Setting offsets in a file

When we discussed the differences between 'write' and 'append' modes earlier, I mentioned offsets being set at the beginning or end of a text file.

Put simply, the offset is the position of the cursor from where the data is to be read or written in the file.

All the methods I have talked about so far read or write the file sequentially, starting from the current offset. If we want to jump to an arbitrary position in the file, Python gives us two methods - seek() and tell()

tell() function

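The tell() function returns the current position of the cursor in the file as an integer. This function takes no argument. When a file is opened in any mode other than 'append' mode, the initial value returned by tell() is zero. In text mode, it is best to treat other values as bookmarks to pass back to seek(), since they do not always equal the number of characters read.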

Syntax

fileObject.tell()

seek() function

The seek() function allows us to move the cursor to a specific position in the file.

Syntax

fileObject.seek(offset, ref)

The function takes two arguments -

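  • offset defines the number of bytes/positions to move, counted from the point of reference
  • ref defines the point of reference: 0 for the beginning of the file (the default), 1 for the current position, and 2 for the end of the file

For files opened in text mode, only seeks relative to the beginning of the file are allowed, apart from seek(0, 2), which jumps to the end.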

Let's understand these two functions with an example -

First, we create a file and write some data.

# Creating and writing data to a file
myFile = open("file.txt", 'w')
myFile.write("Hello world!, this data is being written onto the file.")
myFile.close()

After creating the file, we open it again in 'read' mode and display the position of the file handle before and after reading the file. The offset is set to zero by default.

# reading the file and displaying the offset position before and after reading
myFile = open("file.txt", 'r')
print("default position of the cursor:", myFile.tell())
data = myFile.read()
offset = myFile.tell()
print("current position of the cursor:", offset)

Output:
default position of the cursor: 0
current position of the cursor: 55

We can see that after reading all 55 characters of the file, the offset is 55. Offsets are counted from 0, so the cursor now sits just past the last character, at the end of the file.

Now, to set the offset at a specific position within the file, we use the seek() function.

# moving the cursor to offset 10
offset = myFile.seek(10)
print("new position of the cursor", offset)
myFile.close()

Output:
new position of the cursor 10
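
A common use of tell() and seek() together is to bookmark a position and come back to it later. The sketch below assumes a two-line scratch file named seek_demo.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "seek_demo.txt")   # hypothetical file name
    with open(path, "w") as f:
        f.write("first line\nsecond line\n")

    with open(path, "r") as f:
        start = f.tell()              # position right after opening
        first = f.readline()
        bookmark = f.tell()           # remember where the second line starts
        second = f.readline()

        f.seek(bookmark)              # jump back to the bookmark
        second_again = f.readline()

        f.seek(0)                     # jump back to the very beginning
        first_again = f.readline()

print(start, repr(second_again), repr(first_again))
```

The bookmark value itself is not printed on purpose, because it can differ between operating systems.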

Closing a file

After all the read/write operations are done, it is good practice to close the file. Written data is often held in a memory buffer and isn't actually written to the file until the buffer is flushed. Closing a file makes sure that all the unwritten data is flushed (written) to the file first.

The syntax for closing a file in python is

fileObject.close()

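Note that when we re-assign a file variable to another file, the previous file object is usually closed once Python garbage-collects it. The timing of this depends on the Python implementation, so it is safer not to rely on it and to call close() explicitly.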

Also, we discussed earlier that opening a file using the 'with' clause also closes the file automatically and we don't need to close it manually.
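
When a 'with' block is not convenient, a try/finally pair gives a similar guarantee that close() is called. Here is a rough sketch; the file name close_demo.txt is an assumption for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "close_demo.txt")   # hypothetical file name

    my_file = open(path, "w")
    try:
        my_file.write("buffered text")
        my_file.flush()                   # push buffered data out without closing
        with open(path, "r") as reader:
            seen_after_flush = reader.read()
    finally:
        my_file.close()                   # runs even if an error occurred above

    closed_now = my_file.closed

    try:
        my_file.write("more")             # a closed file cannot be written to
        write_after_close_failed = False
    except ValueError:
        write_after_close_failed = True

print(repr(seen_after_flush), closed_now, write_after_close_failed)
```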

Deleting a file

In order to delete a file from the system, we need to import the 'os' python module.

import os

This library has a lot of useful functions, but the one we need here is os.remove(filename). We pass the name of the file as an argument. If the file does not exist, this function raises a FileNotFoundError.

A safer way to delete a file in Python is to first check whether the file we want to delete exists. We do this by using os.path.exists(filename).

And the code looks like this -

# Deleting a file
import os

if os.path.exists("file.txt"):
    os.remove("file.txt")
else:
    print("This file does not exist!")
Enter fullscreen mode Exit fullscreen mode
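
An alternative to checking first is to attempt the deletion and handle the error if the file is missing. This is a sketch of that approach, not the only correct one; the helper name delete_if_present and the file name are hypothetical.

```python
import os
import tempfile

def delete_if_present(path):
    """Delete path and return True, or return False if it was not there."""
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "delete_demo.txt")   # hypothetical file name
    with open(target, "w") as f:
        f.write("temporary data")

    first_attempt = delete_if_present(target)    # file exists, so it is removed
    second_attempt = delete_if_present(target)   # already gone
    still_there = os.path.exists(target)

print(first_attempt, second_attempt, still_there)
```

This version also behaves sensibly if another program removes the file between the check and the deletion.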

Now that we have covered all the basic concepts for handling text files, it is time for you to practice them yourself and play around with these functions and methods. It might feel a bit overwhelming at first, but it only gets easier with practice and some experience.

Here are a few other resources you can check out -

If you have anything to add that would make this article more informative, feel free to share it.

For any queries, you can connect with me on Twitter @TusharS_23

Hope you found this article helpful. See you in the next one!
