I really love Python's csv module. But I do wish it was a little better documented.
The DictWriter lets you write CSV files very neatly and semantically by defining each row as a Python dict.
Here's the example right out of the docs:
import csv
with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})
    writer.writerow({'first_name': 'Lovely', 'last_name': 'Spam'})
    writer.writerow({'first_name': 'Wonderful', 'last_name': 'Spam'})
Alas, it overlooks what I would consider a (if not the) standard use case, the one I come across all the time, in which I don't write the rows out one by one literally, but in a loop, more like:
import csv
data = [('Baked', 'Beans'),
        ('Lovely', 'Spam'),
        ('Wonderful', 'Spam'),
        ]
with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for datum in data:
        writer.writerow({'first_name': datum[0],
                         'last_name': datum[1]})
Of course:
- The source won't generally be a list of tuples, rather an iterator of objects more generally, and I'm dumping some of their properties.
- I will be working with a **much** longer list of fieldnames.
which raises the spectre of repeating the same list of field names: most definitely not DRY, highly undesirable, and difficult to maintain as I tweak the list of fields I want to dump to CSV.
And yet I find no documentation that gets around that. So, having just nutted one out and tested it, it's worth putting down in a document (right here and now).
The Problem
The problem in a nutshell is that `csv.DictWriter` demands to know the `fieldnames` (and `writer.writeheader()` needs to have them known), but they are specified in the dictionary built inside the loop. `fieldnames` is not even an optional argument to `csv.DictWriter`, and the `writer` is poorly documented.
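To make the constraint concrete, here is a minimal sketch showing that `fieldnames` cannot simply be left out. It uses an in-memory `io.StringIO` buffer as a stand-in for a real file (my choice for illustration, not something from the docs example), and the variable name `missing_fieldnames_error` is just a placeholder.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for an open CSV file

missing_fieldnames_error = None
try:
    # fieldnames is a required argument, so leaving it out fails immediately
    csv.DictWriter(buffer)
except TypeError as error:
    missing_fieldnames_error = error
    print(f"TypeError: {error}")
```

So the field names have to be passed in some form at construction time, before the first row dict has even been built.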
The Solution
The solution rests in two empirically determined (for lack of documentation) facts:
- `csv.DictWriter` accepts `fieldnames=None`
- the `writer` it returns has a `fieldnames` attribute that can be set post creation.
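Both facts can be checked in isolation with a few lines. This is only a sketch of the observed behaviour in CPython, not something the documentation promises; the `io.StringIO` buffer is an assumption standing in for a real file.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for an open CSV file

# Fact 1: the constructor accepts fieldnames=None without complaint
writer = csv.DictWriter(buffer, fieldnames=None)
fieldnames_at_creation = writer.fieldnames

# Fact 2: fieldnames is a plain attribute that can be assigned afterwards
writer.fieldnames = ['first_name', 'last_name']
writer.writeheader()
writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})

output_lines = buffer.getvalue().splitlines()
print(output_lines)
```

The header and the row both come out in the order given by the late-assigned `fieldnames`.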
To wit, this works beautifully:
import csv
data = [('Baked', 'Beans'),
        ('Lovely', 'Spam'),
        ('Wonderful', 'Spam'),
        ]
with open('names.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=None)
    for i, datum in enumerate(data):
        row = {'first_name': datum[0],
               'last_name': datum[1]}
        if i == 0:
            writer.fieldnames = row.keys()
            writer.writeheader()
        writer.writerow(row)
And now the row keys are not doubly specified. And the CSV file receives its header row.
It's a simple paradigm once discovered, and I am now using it for rapidly dumping CSV files describing objects, mostly for testing and study purposes. It means I can play with the row definition in situ, adding and changing fields etc., without having to change them in two places, and I still get the benefit of a DictWriter and its simple syntax for writing CSV files.
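For the more realistic case of an iterator of objects, the same trick can be wrapped in a small helper. This is a sketch under assumptions: the `Person` class, `person_row` and `write_dict_rows` are hypothetical names invented for illustration, and an `io.StringIO` buffer stands in for the file.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Person:  # hypothetical object type, for illustration only
    first_name: str
    last_name: str


def person_row(person):
    """The single place where the CSV columns are defined."""
    return {'first_name': person.first_name,
            'last_name': person.last_name}


def write_dict_rows(csvfile, rows):
    """Write an iterable of dicts, taking the header from the first row."""
    writer = csv.DictWriter(csvfile, fieldnames=None)
    count = 0
    for row in rows:
        if writer.fieldnames is None:
            # Copy the keys so the header does not depend on the row dict
            writer.fieldnames = list(row)
            writer.writeheader()
        writer.writerow(row)
        count += 1
    return count


names = [('Baked', 'Beans'), ('Lovely', 'Spam'), ('Wonderful', 'Spam')]
people = (Person(*name) for name in names)  # a generator, not a list

buffer = io.StringIO()
rows_written = write_dict_rows(buffer, (person_row(p) for p in people))
print(buffer.getvalue())
```

Two caveats with this sketch: an empty iterable produces a file with no header at all, and because the first row fixes the columns, a later row with an extra key raises `ValueError` under `DictWriter`'s default `extrasaction='raise'`.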
Top comments (2)
Lovely solution. However, is there an editing mistake in the final line of the solution? Shouldn't it be "writer.writerow(row)"?
Indeed. Thank you muchly, for spotting it. Just fixed it. Am on a phone though. Looks good to me, but the phone screen is hard to edit on alas.