
One of the most common tasks that you can do with Python is reading and writing files. Whether it's writing to a simple text file, reading a complicated server log, or even analyzing raw byte data, all of these situations require reading or writing a file.

In this tutorial, you'll learn:

  • What makes up a file and why that's important in Python
  • The basics of reading and writing files in Python
  • Some basic scenarios of reading and writing files

This tutorial is mainly for beginner to intermediate Pythonistas, but there are some tips in here that more advanced programmers may appreciate as well.

What Is a File?

Before we can go into how to work with files in Python, it's important to understand what exactly a file is and how modern operating systems handle some of their aspects.

At its core, a file is a contiguous set of bytes used to store data. This data is organized in a specific format and can be anything as simple as a text file or as complicated as a program executable. In the end, these byte files are then translated into binary 1 and 0 for easier processing by the computer.

Files on most modern file systems are composed of three main parts:

  1. Header: metadata about the contents of the file (file name, size, type, and so on)
  2. Data: contents of the file as written by the creator or editor
  3. End of file (EOF): special character that indicates the end of the file

[Figure: the file format, with the header on top, the data contents in the middle, and the footer on the bottom.]

What this data represents depends on the format specification used, which is typically represented by an extension. For example, a file that has an extension of .gif most likely conforms to the Graphics Interchange Format specification. There are hundreds, if not thousands, of file extensions out there. For this tutorial, you'll only deal with .txt or .csv file extensions.

File Paths

When you access a file on an operating system, a file path is required. The file path is a string that represents the location of a file. It's broken up into three major parts:

  1. Folder Path: the file folder location on the file system where subsequent folders are separated by a forward slash / (Unix) or backslash \ (Windows)
  2. File Name: the actual name of the file
  3. Extension: the end of the file path pre-pended with a period (.) used to indicate the file type

Here's a quick example. Let's say you have a file located within a file structure like this:

    /
    │
    ├── path/
    │   │
    │   ├── to/
    │   │   └── cats.gif
    │   │
    │   └── dog_breeds.txt
    │
    └── animals.csv

Let's say you wanted to access the cats.gif file, and your current location was in the same folder as path. In order to access the file, you need to go through the path folder and then the to folder, finally arriving at the cats.gif file. The Folder Path is path/to/. The File Name is cats. The File Extension is .gif. So the full path is path/to/cats.gif.

Now let's say that your current location or current working directory (cwd) is in the to folder of our example folder structure. Instead of referring to the cats.gif by the full path of path/to/cats.gif, the file can be simply referenced by the file name and extension cats.gif.

    /
    │
    ├── path/
    │   │
    │   ├── to/  ← Your current working directory (cwd) is here
    │   │   └── cats.gif  ← Accessing this file
    │   │
    │   └── dog_breeds.txt
    │
    └── animals.csv

But what about dog_breeds.txt? How would you access that without using the full path? You can use the special characters double-dot (..) to move one directory up. This means that ../dog_breeds.txt will reference the dog_breeds.txt file from the directory of to:

    /
    │
    ├── path/  ← Referencing this parent folder
    │   │
    │   ├── to/  ← Current working directory (cwd)
    │   │   └── cats.gif
    │   │
    │   └── dog_breeds.txt  ← Accessing this file
    │
    └── animals.csv

The double-dot (..) can be chained together to traverse multiple directories above the current directory. For example, to access animals.csv from the to folder, you would use ../../animals.csv.
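
The path parts and the double-dot behavior described above can be checked with the standard library. The following is a minimal sketch that works purely on path strings, so none of the example files need to exist. The variable names are illustrative, and the cwd value of /path/to is an assumption based on the folder structure shown earlier.

```python
import posixpath
from pathlib import PurePosixPath

# The example file from the folder structure above
cats = PurePosixPath('path/to/cats.gif')

folder_path = cats.parent   # the folder path: path/to
file_name = cats.stem       # the file name: cats
extension = cats.suffix     # the extension: .gif

# Assumed current working directory: the 'to' folder
cwd = '/path/to'

# normpath() collapses each '..' segment textually
dog_breeds = posixpath.normpath(posixpath.join(cwd, '../dog_breeds.txt'))
animals = posixpath.normpath(posixpath.join(cwd, '../../animals.csv'))

print(folder_path, file_name, extension)
print(dog_breeds)
print(animals)
```

Because the resolution here is textual, it does not follow symbolic links. For real files on disk, pathlib.Path.resolve() is usually the better fit.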

Line Endings

One problem often encountered when working with file data is the representation of a new line or line ending. The line ending has its roots from back in the Morse Code era, when a specific pro-sign was used to communicate the end of a transmission or the end of a line.

Later, this was standardized for teleprinters by both the International Organization for Standardization (ISO) and the American Standards Association (ASA). The ASA standard states that line endings should use the sequence of the Carriage Return (CR or \r) and the Line Feed (LF or \n) characters (CR+LF or \r\n). The ISO standard, however, allowed for either the CR+LF characters or just the LF character.

Windows uses the CR+LF characters to indicate a new line, while Unix and the newer Mac versions use just the LF character. This can cause some complications when you're processing files on an operating system that is different than the file's source. Here's a quick example. Let's say that we examine the file dog_breeds.txt that was created on a Windows system:

    Pug\r\n
    Jack Russell Terrier\r\n
    English Springer Spaniel\r\n
    German Shepherd\r\n
    Staffordshire Bull Terrier\r\n
    Cavalier King Charles Spaniel\r\n
    Golden Retriever\r\n
    West Highland White Terrier\r\n
    Boxer\r\n
    Border Terrier\r\n

This same output will be interpreted on a Unix device differently:

    Pug\r
    \n
    Jack Russell Terrier\r
    \n
    English Springer Spaniel\r
    \n
    German Shepherd\r
    \n
    Staffordshire Bull Terrier\r
    \n
    Cavalier King Charles Spaniel\r
    \n
    Golden Retriever\r
    \n
    West Highland White Terrier\r
    \n
    Boxer\r
    \n
    Border Terrier\r
    \n

This can make iterating over each line problematic, and you may need to account for situations like this.
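
One way to account for this in Python is through the newline parameter of open(). The sketch below writes a small Windows-style file into a temporary directory (an assumption made so the example is self-contained) and reads it back twice. By default, text mode translates \r\n to \n, while newline='' leaves the line endings untouched.

```python
import os
import tempfile

# Raw bytes as a Windows editor would have saved them
windows_bytes = b'Pug\r\nJack Russell Terrier\r\nBoxer\r\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'dog_breeds.txt')
    with open(path, 'wb') as writer:
        writer.write(windows_bytes)

    # Default text mode: universal newlines translate \r\n to \n
    with open(path, 'r') as reader:
        translated = reader.readlines()

    # newline='' disables translation, so the \r characters stay visible
    with open(path, 'r', newline='') as reader:
        untranslated = reader.readlines()

# rstrip('\r\n') removes either style of line ending
cleaned = [line.rstrip('\r\n') for line in untranslated]

print(translated)
print(untranslated)
print(cleaned)
```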

Character Encodings

Another common problem that you may face is the encoding of the byte data. An encoding is a translation from byte data to human readable characters. This is typically done by assigning a numerical value to represent a character. The two most common encodings are the ASCII and UNICODE Formats. ASCII can only store 128 characters, while Unicode can contain up to 1,114,112 characters.

ASCII is actually a subset of Unicode (UTF-8), meaning that ASCII and Unicode share the same numerical to character values. It's important to note that parsing a file with the incorrect character encoding can lead to failures or misrepresentation of the character. For example, if a file was created using the UTF-8 encoding, and you try to parse it using the ASCII encoding, if there is a character that is outside of those 128 values, then an error will be thrown.
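
The failure described above is easy to reproduce without any files. This minimal sketch encodes a short string containing one non-ASCII character and then tries to decode the bytes as ASCII. The sample text is an arbitrary choice for illustration.

```python
# 'Caf\u00e9' is "Cafe" with an accented final letter (U+00E9)
text = 'Caf\u00e9'
utf8_bytes = text.encode('utf-8')  # the accented letter becomes two bytes

# Pure ASCII text has identical bytes in both encodings
same_bytes = 'Pug'.encode('ascii') == 'Pug'.encode('utf-8')

try:
    utf8_bytes.decode('ascii')
    decode_failed = False
except UnicodeDecodeError:
    # Bytes above 127 have no meaning in ASCII
    decode_failed = True

# errors='replace' substitutes U+FFFD for each undecodable byte
lossy = utf8_bytes.decode('ascii', errors='replace')

# Decoding with the encoding that was used to write restores the text
round_trip = utf8_bytes.decode('utf-8')
```

When opening a file in text mode, the same choice is made through the encoding argument, for example open(path, encoding='utf-8').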

Opening and Closing a File in Python

When you want to work with a file, the first thing to do is to open it. This is done by invoking the open() built-in function. open() has a single required argument that is the path to the file. open() has a single return, the file object:

    file = open('dog_breeds.txt')

After you open a file, the next thing to learn is how to close it.

It's important to remember that it's your responsibility to close the file. In most cases, upon termination of an application or script, a file will be closed eventually. However, there is no guarantee when exactly that will happen. This can lead to unwanted behavior including resource leaks. It's also a best practice within Python (Pythonic) to make sure that your code behaves in a way that is well defined and reduces any unwanted behavior.

When you're manipulating a file, there are two ways that you can use to ensure that a file is closed properly, even when encountering an error. The first way to close a file is to use the try-finally block:

    reader = open('dog_breeds.txt')
    try:
        # Further file processing goes here
        pass
    finally:
        reader.close()

If you're unfamiliar with what the try-finally block is, check out Python Exceptions: An Introduction.

The second way to close a file is to use the with statement:

    with open('dog_breeds.txt') as reader:
        # Further file processing goes here
        pass

The with statement automatically takes care of closing the file once it leaves the with block, even in cases of error. I highly recommend that you use the with statement as much as possible, as it allows for cleaner code and makes handling any unexpected errors easier for you.
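
To see that the file really is closed when an error occurs, you can inspect the file object's closed attribute afterwards. This is a minimal sketch: the temporary file and the simulated ValueError are assumptions added only to make the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'dog_breeds.txt')
    with open(path, 'w') as writer:
        writer.write('Pug\n')

    try:
        with open(path) as reader:
            first_line = reader.readline()
            # Simulate a failure in the middle of file processing
            raise ValueError('simulated processing error')
    except ValueError:
        pass

    # The with block closed the file even though an exception was raised
    closed_after_error = reader.closed

print(first_line, end='')
print(closed_after_error)
```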

Most likely, you'll also want to use the second positional argument, mode. This argument is a string that contains multiple characters to represent how you want to open the file. The default and most common is 'r', which represents opening the file in read-only mode as a text file:

    with open('dog_breeds.txt', 'r') as reader:
        # Further file processing goes here
        pass

Other options for modes are fully documented online, but the most commonly used ones are the following:

Character       Meaning
'r'             Open for reading (default)
'w'             Open for writing, truncating (overwriting) the file first
'rb' or 'wb'    Open in binary mode (read/write using byte data)
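
The modes in the table can be tried out side by side. The sketch below uses a temporary file named example.txt (a hypothetical name) to show that 'w' discards the previous contents and that 'r' and 'rb' return different types.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')

    with open(path, 'w') as writer:
        writer.write('first version\n')

    # Opening with 'w' again truncates the file before writing
    with open(path, 'w') as writer:
        writer.write('second\n')

    with open(path, 'r') as reader:
        text_content = reader.read()   # a str

    with open(path, 'rb') as reader:
        byte_content = reader.read()   # a bytes object

print(repr(text_content))
print(type(byte_content))
```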

Let's go back and talk a little about file objects. A file object is:

"an object exposing a file-oriented API (with methods such as read() or write()) to an underlying resource." (Source)

There are 3 different categories of file objects:

  • Text files
  • Buffered binary files
  • Raw binary files

Each of these file types is defined in the io module. Here's a quick rundown of how everything lines up.

Text File Types

A text file is the most common file that you'll encounter. Here are some examples of how these files are opened:

    open('abc.txt')

    open('abc.txt', 'r')

    open('abc.txt', 'w')

With these types of files, open() will return a TextIOWrapper file object:

    >>> file = open('dog_breeds.txt')
    >>> type(file)
    <class '_io.TextIOWrapper'>

This is the default file object returned by open().

Buffered Binary File Types

A buffered binary file type is used for reading and writing binary files. Here are some examples of how these files are opened:

    open('abc.txt', 'rb')

    open('abc.txt', 'wb')

With these types of files, open() will return either a BufferedReader or BufferedWriter file object:

    >>> file = open('dog_breeds.txt', 'rb')
    >>> type(file)
    <class '_io.BufferedReader'>
    >>> file = open('dog_breeds.txt', 'wb')
    >>> type(file)
    <class '_io.BufferedWriter'>

Raw File Types

A raw file type is:

"generally used as a low-level building-block for binary and text streams." (Source)

It is therefore not typically used.

Here's an example of how these files are opened:

    open('abc.txt', 'rb', buffering=0)

With these types of files, open() will return a FileIO file object:

    >>> file = open('dog_breeds.txt', 'rb', buffering=0)
    >>> type(file)
    <class '_io.FileIO'>
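
The three categories map onto abstract base classes in the io module, so they can be told apart with isinstance(). The following is a small sketch that creates its own temporary abc.txt file so that it can run anywhere.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'abc.txt')
    with open(path, 'w') as writer:
        writer.write('abc\n')

    # Text mode returns a TextIOWrapper, a kind of TextIOBase
    with open(path, 'r') as text_file:
        is_text = isinstance(text_file, io.TextIOBase)

    # Binary mode returns a BufferedReader, a kind of BufferedIOBase
    with open(path, 'rb') as buffered_file:
        is_buffered = isinstance(buffered_file, io.BufferedIOBase)

    # Binary mode with buffering=0 returns a FileIO, a kind of RawIOBase
    with open(path, 'rb', buffering=0) as raw_file:
        is_raw = isinstance(raw_file, io.RawIOBase)

print(is_text, is_buffered, is_raw)
```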

Reading and Writing Opened Files

Once you've opened up a file, you'll want to read or write to the file. First off, let's cover reading a file. There are multiple methods that can be called on a file object to help you out:

Method                What It Does
.read(size=-1)        This reads from the file based on the number of size bytes. If no argument is passed or None or -1 is passed, then the entire file is read.
.readline(size=-1)    This reads at most size number of characters from the line. This continues to the end of the line and then wraps back around. If no argument is passed or None or -1 is passed, then the entire line (or rest of the line) is read.
.readlines()          This reads the remaining lines from the file object and returns them as a list.

Using the same dog_breeds.txt file you used above, let's go through some examples of how to use these methods. Here's an example of how to open and read the entire file using .read():

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read & print the entire file
    ...     print(reader.read())
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

Here's an example of how to read 5 characters of a line each time using the Python .readline() method:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read & print the first 5 characters of the line 5 times
    ...     print(reader.readline(5))
    ...     # Notice that line is greater than the 5 chars and continues
    ...     # down the line, reading 5 chars each time until the end of the
    ...     # line and then "wraps" around
    ...     print(reader.readline(5))
    ...     print(reader.readline(5))
    ...     print(reader.readline(5))
    ...     print(reader.readline(5))
    Pug

    Jack
    Russe
    ll Te
    rrier

Here's an example of how to read the entire file as a list using the Python .readlines() method:

    >>> f = open('dog_breeds.txt')
    >>> f.readlines()  # Returns a list object
    ['Pug\n', 'Jack Russell Terrier\n', 'English Springer Spaniel\n', 'German Shepherd\n', 'Staffordshire Bull Terrier\n', 'Cavalier King Charles Spaniel\n', 'Golden Retriever\n', 'West Highland White Terrier\n', 'Boxer\n', 'Border Terrier\n']

The above example can also be done by using list() to create a list out of the file object:

    >>> f = open('dog_breeds.txt')
    >>> list(f)
    ['Pug\n', 'Jack Russell Terrier\n', 'English Springer Spaniel\n', 'German Shepherd\n', 'Staffordshire Bull Terrier\n', 'Cavalier King Charles Spaniel\n', 'Golden Retriever\n', 'West Highland White Terrier\n', 'Boxer\n', 'Border Terrier\n']
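
The three reading methods share a single position in the file, so each call picks up where the previous one stopped. The sketch below mixes them on a shortened, three-line version of the breed list, which is an assumption made to keep the example small.

```python
import os
import tempfile

breeds = 'Pug\nJack Russell Terrier\nBoxer\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'dog_breeds.txt')
    with open(path, 'w') as writer:
        writer.write(breeds)

    with open(path, 'r') as reader:
        first_three = reader.read(3)      # the first three characters
        rest_of_line = reader.readline()  # what is left of line one
        partial = reader.readline(4)      # at most four characters of line two
        remaining = reader.readlines()    # everything not yet read, as a list

print(repr(first_three), repr(rest_of_line), repr(partial))
print(remaining)
```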

Iterating Over Each Line in the File

A common thing to do while reading a file is to iterate over each line. Here's an example of how to use the Python .readline() method to perform that iteration:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read and print the entire file line by line
    ...     line = reader.readline()
    ...     while line != '':  # The EOF char is an empty string
    ...         print(line, end='')
    ...         line = reader.readline()
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

Another way you could iterate over each line in the file is to use the Python .readlines() method of the file object. Remember, .readlines() returns a list where each element in the list represents a line in the file:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     for line in reader.readlines():
    ...         print(line, end='')
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

However, the above examples can be further simplified by iterating over the file object itself:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     # Read and print the entire file line by line
    ...     for line in reader:
    ...         print(line, end='')
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier

This final approach is more Pythonic and can be quicker and more memory efficient. Therefore, it is suggested you use this instead.
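
Because a file object is an iterator over its lines, it also works with tools such as enumerate(). Here is a minimal sketch that numbers each line while stripping its line ending. The three-line sample file is an assumption for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'dog_breeds.txt')
    with open(path, 'w') as writer:
        writer.write('Pug\nBoxer\nBorder Terrier\n')

    numbered = []
    with open(path, 'r') as reader:
        # The file object yields one line at a time, so the whole file
        # never has to be held in memory at once
        for number, line in enumerate(reader, start=1):
            numbered.append(f'{number}: {line.rstrip()}')

print('\n'.join(numbered))
```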

Now let's dive into writing files. As with reading files, file objects have multiple methods that are useful for writing to a file:

Method              What It Does
.write(string)      This writes the string to the file.
.writelines(seq)    This writes the sequence to the file. No line endings are appended to each sequence item. It's up to you to add the appropriate line ending(s).

Here's a quick example of using .write() and .writelines():

    with open('dog_breeds.txt', 'r') as reader:
        # Note: readlines doesn't trim the line endings
        dog_breeds = reader.readlines()

    with open('dog_breeds_reversed.txt', 'w') as writer:
        # Alternatively you could use
        # writer.writelines(reversed(dog_breeds))

        # Write the dog breeds to the file in reversed order
        for breed in reversed(dog_breeds):
            writer.write(breed)
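
In the example above, the lines came from .readlines(), so they already ended in a newline. When the items come from somewhere else, you have to add the line endings yourself. The following is a minimal sketch using a plain list of names, with the file name breeds.txt chosen only for illustration.

```python
import os
import tempfile

breeds = ['Pug', 'Boxer', 'Border Terrier']

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'breeds.txt')

    with open(path, 'w') as writer:
        # writelines() adds nothing between items, so append '\n' yourself
        writer.writelines(f'{breed}\n' for breed in breeds)

    with open(path, 'r') as reader:
        written = reader.read()

print(written, end='')
```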

Working With Bytes

Sometimes, you may need to work with files using byte strings. This is done by adding the 'b' character to the mode argument. All of the same methods for the file object apply. However, each of the methods expect and return a bytes object instead:

    >>> with open('dog_breeds.txt', 'rb') as reader:
    ...     print(reader.readline())
    b'Pug\n'

Opening a text file using the b flag isn't that interesting. Let's say we have this cute picture of a Jack Russell Terrier (jack_russell.png):

A cute picture of a Jack Russell Terrier
Image: CC BY 3.0 (https://creativecommons.org/licenses/by/3.0), from Wikimedia Commons

You can actually open up that file in Python and examine the contents! Since the .png file format is well defined, the header of the file is eight bytes broken up like this:

Value             Interpretation
0x89              A "magic" number to indicate that this is the start of a PNG
0x50 0x4E 0x47    PNG in ASCII
0x0D 0x0A         A DOS style line ending \r\n
0x1A              A DOS style EOF character
0x0A              A Unix style line ending \n

Sure enough, when you open the file and read these bytes individually, you can see that this is indeed a .png header file:

    >>> with open('jack_russell.png', 'rb') as byte_reader:
    ...     print(byte_reader.read(1))
    ...     print(byte_reader.read(3))
    ...     print(byte_reader.read(2))
    ...     print(byte_reader.read(1))
    ...     print(byte_reader.read(1))
    b'\x89'
    b'PNG'
    b'\r\n'
    b'\x1a'
    b'\n'
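
The same eight bytes can be used to check whether a binary stream looks like a PNG at all. This is a minimal sketch rather than a full validator: looks_like_png() is a hypothetical helper, and io.BytesIO stands in for a file opened with mode 'rb' so that no image file is required.

```python
import io

# The eight-byte PNG signature described in the table above
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def looks_like_png(stream) -> bool:
    """Return True if the binary stream starts with the PNG signature."""
    return stream.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


# In-memory stand-ins for files opened in binary mode
fake_png = io.BytesIO(PNG_SIGNATURE + b'rest of the image data')
fake_text = io.BytesIO(b'Pug\nJack Russell Terrier\n')

png_result = looks_like_png(fake_png)
text_result = looks_like_png(fake_text)

print(png_result, text_result)
```

With a real file, the call would look like looks_like_png(open('jack_russell.png', 'rb')), ideally inside a with block.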

A Full Example: dos2unix.py

Let's bring this whole thing home and look at a full example of how to read and write to a file. The following is a dos2unix like tool that will convert a file that contains line endings of \r\n to \n.

This tool is cleaved up into three major sections. The start is str2unix(), which converts a cord from \r\n line endings to \n. The second is dos2unix(), which converts a string that contains \r\n characters into \n. dos2unix() calls str2unix() internally. Finally, in that location's the __main__ cake, which is chosen only when the file is executed equally a script. Think of information technology as the main function found in other programming languages.

    """
    A simple script and library to convert files or strings from dos like
    line endings to Unix like line endings.
    """

    import argparse
    import os


    def str2unix(input_str: str) -> str:
        r"""
        Converts the string from \r\n line endings to \n

        Parameters
        ----------
        input_str
            The string whose line endings will be converted

        Returns
        -------
            The converted string
        """
        r_str = input_str.replace('\r\n', '\n')
        return r_str


    def dos2unix(source_file: str, dest_file: str):
        """
        Converts a file that contains Dos like line endings into Unix like

        Parameters
        ----------
        source_file
            The path to the source file to be converted
        dest_file
            The path to the converted file for output
        """
        # NOTE: Could add file existence checking and file overwriting
        # protection
        with open(source_file, 'r') as reader:
            dos_content = reader.read()

        unix_content = str2unix(dos_content)

        with open(dest_file, 'w') as writer:
            writer.write(unix_content)


    if __name__ == "__main__":
        # Create our Argument parser and set its description
        parser = argparse.ArgumentParser(
            description="Script that converts a DOS like file to an Unix like file",
        )

        # Add the arguments:
        #   - source_file: the source file we want to convert
        #   - dest_file: the destination where the output should go

        # Note: the use of the argument type of argparse.FileType could
        # streamline some things
        parser.add_argument(
            'source_file',
            help='The location of the source file'
        )

        parser.add_argument(
            '--dest_file',
            help='Location of dest file (default: source_file appended with `_unix`)',
            default=None
        )

        # Parse the args (argparse automatically grabs the values from
        # sys.argv)
        args = parser.parse_args()

        s_file = args.source_file
        d_file = args.dest_file

        # If the destination file wasn't passed, then assume we want to
        # create a new file based on the old one
        if d_file is None:
            file_path, file_extension = os.path.splitext(s_file)
            d_file = f'{file_path}_unix{file_extension}'

        dos2unix(s_file, d_file)
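One caveat about the script above: in text mode with the default newline setting, Python already translates \r\n to \n when reading, and on Windows it translates \n back to \r\n when writing. The following is a minimal sketch of a variant that passes newline='' to switch that translation off, so the bytes that land on disk are controlled explicitly. The name dos2unix_exact and the temporary file names are hypothetical and are not part of the original script.

```python
import os
import tempfile


def dos2unix_exact(source_file: str, dest_file: str) -> None:
    """Hypothetical variant of dos2unix that disables newline translation."""
    # newline='' on read: line endings are returned untranslated
    with open(source_file, 'r', newline='') as reader:
        content = reader.read()

    # newline='' on write: '\n' is written as-is on every platform
    with open(dest_file, 'w', newline='') as writer:
        writer.write(content.replace('\r\n', '\n'))


# Small demonstration using temporary files
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'dos.txt')
    dst = os.path.join(tmp, 'unix.txt')
    with open(src, 'wb') as f:
        f.write(b'Pug\r\nBoxer\r\n')

    dos2unix_exact(src, dst)

    # Read the result back as raw bytes to inspect the line endings
    with open(dst, 'rb') as f:
        converted = f.read()

print(converted)
```

Reading the result back in 'rb' mode is deliberate: it shows the raw bytes, so you can check the line endings without any translation getting in the way.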

Tips and Tricks

Now that you've mastered the basics of reading and writing files, here are some tips and tricks to help you grow your skills.

__file__

The __file__ attribute is a special attribute of modules, similar to __name__. It is:

"the pathname of the file from which the module was loaded, if it was loaded from a file." (Source)

Here's a real world example. In one of my past jobs, I did multiple tests for a hardware device. Each test was written using a Python script with the test script file name used as a title. These scripts would then be executed and could print their status using the __file__ special attribute. Here's an example folder structure:

    project/
    |
    ├── tests/
    |   ├── test_commanding.py
    |   ├── test_power.py
    |   ├── test_wireHousing.py
    |   └── test_leds.py
    |
    └── main.py

Running main.py produces the following:

    $ python main.py
    tests/test_commanding.py Started:
    tests/test_commanding.py Passed!
    tests/test_power.py Started:
    tests/test_power.py Passed!
    tests/test_wireHousing.py Started:
    tests/test_wireHousing.py Failed!
    tests/test_leds.py Started:
    tests/test_leds.py Passed!

I was able to run and get the status of all my tests dynamically through use of the __file__ special attribute.
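The test scripts themselves aren't shown, so here is a rough sketch of how a single script might title its own status lines with __file__. The helper status_line is a hypothetical name, and the fallback to a placeholder is an assumption added so the snippet also runs where __file__ isn't defined, such as an interactive session.

```python
def status_line(script_path: str, passed: bool) -> str:
    """Return a one-line status report titled with the script's path."""
    result = 'Passed!' if passed else 'Failed!'
    return f'{script_path} {result}'


# __file__ only exists when the module was loaded from a file, so fall
# back to a placeholder in an interactive session.
this_script = globals().get('__file__', '<interactive>')

print(f'{this_script} Started:')
print(status_line(this_script, passed=True))
```

Because the title comes from __file__, renaming or moving a test script changes its report without any edit to the script's contents.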

Appending to a File

Sometimes, you may want to append to a file or start writing at the end of an already populated file. This is easily done by using the 'a' character for the mode argument:

    with open('dog_breeds.txt', 'a') as a_writer:
        a_writer.write('\nBeagle')

When you examine dog_breeds.txt again, you'll see that the beginning of the file is unchanged and Beagle is now added to the end of the file:

    >>> with open('dog_breeds.txt', 'r') as reader:
    ...     print(reader.read())
    ...
    Pug
    Jack Russell Terrier
    English Springer Spaniel
    German Shepherd
    Staffordshire Bull Terrier
    Cavalier King Charles Spaniel
    Golden Retriever
    West Highland White Terrier
    Boxer
    Border Terrier
    Beagle
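To make the difference between 'a' and 'w' concrete, here's a small sketch that works on a throwaway file in a temporary directory, so your real dog_breeds.txt is left alone. The file name breeds_demo.txt is made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'breeds_demo.txt')

    # 'a' creates the file if it does not exist yet
    with open(path, 'a') as a_writer:
        a_writer.write('Pug')

    # A second 'a' keeps the existing content and writes after it
    with open(path, 'a') as a_writer:
        a_writer.write('\nBeagle')

    with open(path, 'r') as reader:
        after_append = reader.read()

    # 'w' truncates the file before writing
    with open(path, 'w') as writer:
        writer.write('Boxer')

    with open(path, 'r') as reader:
        after_write = reader.read()

print(repr(after_append))
print(repr(after_write))
```

The first value still contains Pug followed by Beagle, while the second contains only Boxer, because opening with 'w' discarded everything that was there before.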

Working With Two Files at the Same Time

There are times when you may want to read a file and write to another file at the same time. If you use the example that was shown when you were learning how to write to a file, it can actually be combined into the following:

    d_path = 'dog_breeds.txt'
    d_r_path = 'dog_breeds_reversed.txt'
    with open(d_path, 'r') as reader, open(d_r_path, 'w') as writer:
        dog_breeds = reader.readlines()
        writer.writelines(reversed(dog_breeds))
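Reversing needs every line up front, which is why the example above calls readlines(). When the order doesn't matter, you could instead process one line at a time while both files are open. Here's a sketch of that idea, assuming a made-up task (upper-casing each line) and temporary demo files:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'dog_breeds_demo.txt')
    dst_path = os.path.join(tmp, 'dog_breeds_upper_demo.txt')

    with open(src_path, 'w') as writer:
        writer.write('Pug\nBoxer\nBeagle\n')

    # Both files stay open for the duration of the block; each line is
    # read, transformed, and written before the next one is read.
    with open(src_path, 'r') as reader, open(dst_path, 'w') as writer:
        for line in reader:
            writer.write(line.upper())

    with open(dst_path, 'r') as reader:
        result = reader.read()

print(result)
```

Since only one line is held in memory at a time, this pattern tends to suit large files better than reading everything with readlines().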

Creating Your Own Context Manager

There may come a time when you'll need finer control of the file object by placing it inside a custom class. When you do this, the with statement can no longer be used unless you add a few magic methods: __enter__ and __exit__. By adding these, you'll have created what's called a context manager.

__enter__() is invoked when calling the with statement. __exit__() is called upon exiting from the with statement block.

Here's a template that you can use to make your custom class:

    class my_file_reader():
        def __init__(self, file_path):
            self.__path = file_path
            self.__file_object = None

        def __enter__(self):
            self.__file_object = open(self.__path)
            return self

        def __exit__(self, type, val, tb):
            self.__file_object.close()

        # Additional methods implemented below

Now that you've got your custom class that is now a context manager, you can use it similarly to the open() built-in:

    with my_file_reader('dog_breeds.txt') as reader:
        # Perform custom class operations
        pass
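If you'd like to see when each magic method actually runs, here's a small sketch that records the order of events. TracedResource is a hypothetical class written only for this illustration. It touches no files, and it raises an exception on purpose to show that __exit__() still runs.

```python
class TracedResource:
    """Hypothetical context manager that records when its hooks run."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.events.append('exit')
        # Returning False lets any exception propagate to the caller
        return False


resource = TracedResource()
try:
    with resource as r:
        r.events.append('body')
        raise ValueError('something went wrong')
except ValueError:
    resource.events.append('handled')

print(resource.events)
```

The recorded order is enter, body, exit, handled. In other words, __exit__() runs before the exception reaches the surrounding except block, which is what makes it a reliable place to close a file.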

Here's a good example. Remember the cute Jack Russell image we had? Perhaps you want to open other .png files but don't want to parse the header file each time. Here's an example of how to do this. This example also uses custom iterators. If you're not familiar with them, check out Python Iterators:

    class PngReader():
        # Every .png file contains this in the header.  Use it to verify
        # the file is indeed a .png.
        _expected_magic = b'\x89PNG\r\n\x1a\n'

        def __init__(self, file_path):
            # Ensure the file has the correct extension
            if not file_path.endswith('.png'):
                raise NameError("File must be a '.png' extension")
            self.__path = file_path
            self.__file_object = None

        def __enter__(self):
            self.__file_object = open(self.__path, 'rb')

            magic = self.__file_object.read(8)
            if magic != self._expected_magic:
                # __exit__() is not called when __enter__() raises, so
                # close the file here before reporting the problem
                self.__file_object.close()
                raise TypeError("The File is not a properly formatted .png file!")

            return self

        def __exit__(self, type, val, tb):
            self.__file_object.close()

        def __iter__(self):
            # This and __next__() are used to create a custom iterator
            # See https://dbader.org/blog/python-iterators
            return self

        def __next__(self):
            # The file hasn't been opened, so there is nothing to read.
            if self.__file_object is None:
                raise StopIteration

            # Read the file in "Chunks"
            # See https://en.wikipedia.org/wiki/Portable_Network_Graphics#%22Chunks%22_within_the_file
            initial_data = self.__file_object.read(4)

            # The file has reached EOF.  This means we can't go any
            # further so stop the iteration by raising the StopIteration.
            if initial_data == b'':
                raise StopIteration

            # Each chunk has a len, type, data (based on len) and crc
            # Grab these values and return them as a tuple
            chunk_len = int.from_bytes(initial_data, byteorder='big')
            chunk_type = self.__file_object.read(4)
            chunk_data = self.__file_object.read(chunk_len)
            chunk_crc = self.__file_object.read(4)
            return chunk_len, chunk_type, chunk_data, chunk_crc

You can now open .png files and properly parse them using your custom context manager:

    >>> with PngReader('jack_russell.png') as reader:
    ...     for l, t, d, c in reader:
    ...         print(f"{l:05}, {t}, {c}")
    ...
    00013, b'IHDR', b'5\x121k'
    00001, b'sRGB', b'\xae\xce\x1c\xe9'
    00009, b'pHYs', b'(<]\x19'
    00345, b'iTXt', b"L\xc2'Y"
    16384, b'IDAT', b'i\x99\x0c('
    16384, b'IDAT', b'\xb3\xfa\x9a$'
    16384, b'IDAT', b'\xff\xbf\xd1\n'
    16384, b'IDAT', b'\xc3\x9c\xb1}'
    16384, b'IDAT', b'\xe3\x02\xba\x91'
    16384, b'IDAT', b'\xa0\xa99='
    16384, b'IDAT', b'\xf4\x8b.\x92'
    16384, b'IDAT', b'\x17i\xfc\xde'
    16384, b'IDAT', b'\x8fb\x0e\xe4'
    16384, b'IDAT', b')3={'
    01040, b'IDAT', b'\xd6\xb8\xc1\x9f'
    00000, b'IEND', b'\xaeB`\x82'
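The last value in each tuple is the chunk's CRC. In the PNG format it is a CRC-32 computed over the chunk type and chunk data, which matches what zlib.crc32() calculates. As a possible next step, here's a sketch of a checker you could apply to the tuples yielded by the reader. The function name chunk_crc_ok is hypothetical, and the demonstration uses the IEND values printed above rather than a real file.

```python
import struct
import zlib


def chunk_crc_ok(chunk_type: bytes, chunk_data: bytes, chunk_crc: bytes) -> bool:
    """Check a PNG chunk's CRC, which covers the type and data fields."""
    expected = zlib.crc32(chunk_type + chunk_data)
    # The stored CRC is a 4-byte big-endian unsigned integer
    (stored,) = struct.unpack('>I', chunk_crc)
    return expected == stored


# The IEND chunk carries no data, so its CRC is the same in every PNG file
print(chunk_crc_ok(b'IEND', b'', b'\xaeB`\x82'))
print(chunk_crc_ok(b'IEND', b'', b'\x00\x00\x00\x00'))
```

The first call reports a match and the second a mismatch, so a check like this could be used to spot corrupted chunks while iterating.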

Don't Re-Invent the Snake

There are common situations that you may encounter while working with files. Most of these cases can be handled using other modules. Two common file types you may need to work with are .csv and .json. Real Python has already put together some great articles on how to handle these:

  • Reading and Writing CSV Files in Python
  • Working With JSON Data in Python

Additionally, there are built-in libraries out there that you can use to help you:

  • wave : read and write WAV files (audio)
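  • aifc : read and write AIFF and AIFC files (audio; removed from the standard library in Python 3.13)
  • sunau : read and write Sun AU files (removed from the standard library in Python 3.13)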
  • tarfile : read and write tar archive files
  • zipfile : work with ZIP archives
  • configparser : easily create and parse configuration files
  • xml.etree.ElementTree : create or read XML based files
  • msilib : read and write Microsoft Installer files (removed from the standard library in Python 3.13)
  • plistlib : generate and parse Mac OS X .plist files
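As a taste of how little code these modules can require, here's a brief sketch using zipfile. It builds the archive in memory with io.BytesIO so nothing is written to disk. The archive member name and its contents are made up for the illustration.

```python
import io
import zipfile

# Build a small ZIP archive in memory instead of on disk
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('dog_breeds.txt', 'Pug\nBoxer\n')

# Reopen the same bytes for reading
buffer.seek(0)
with zipfile.ZipFile(buffer, 'r') as archive:
    names = archive.namelist()
    content = archive.read('dog_breeds.txt').decode('utf-8')

print(names)
print(content)
```

Note that ZipFile is itself a context manager, so the with pattern you've used for plain files carries over directly.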

There are plenty more out there. Additionally, there are even more third-party tools available on PyPI. Some popular ones are the following:

  • PyPDF2 : PDF toolkit
  • xlwings : read and write Excel files
  • Pillow : epitome reading and manipulation

You're a File Wizard Harry!

You did it! You now know how to work with files with Python, including some advanced techniques. Working with files in Python should now be easier than ever and is a rewarding feeling when you start doing it.

In this tutorial you've learned:

  • What a file is
  • How to open and close files properly
  • How to read and write files
  • Some advanced techniques when working with files
  • Some libraries to work with common file types

If you have any questions, hit us up in the comments.
