Head of Digital Marketing @Bronco. Lover of Python and Data.

Working With CSV Files

When working with large datasets I tend to use Python, as it's a lot faster than Excel for file manipulations and doesn't crash on large inputs.

One thing I struggled with in the past was column selection. I have spoken to lots of people and done plenty of reading on the subject, and I think I have found the most elegant solution. Even better, it uses the standard library rather than Pandas, which everyone seems to suggest.

Here it is...


import csv  # standard library, no pip install needed

with open("file.csv", newline="") as f:
    data = csv.reader(f)
    for line in data:
        print(line[0], line[1], line[2], line[3])  # the index picks the column, counting from 0

Beautiful, isn't it?
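One catch with indexing by position is that a row with fewer columns than expected raises an IndexError. Here is a minimal Python 3 sketch of one way around that. The `select_columns` helper and the `SAMPLE` data are my own illustrative names, not part of the csv module, and the in-memory string just stands in for file.csv.

```python
import csv
import io

# Hypothetical sample data standing in for file.csv; the last row is short on purpose
SAMPLE = (
    "url,title,status,anchor\n"
    "http://a.example,Home,200,brand\n"
    "http://b.example,About,404\n"
)

def select_columns(rows, indexes, default=""):
    """Yield only the requested columns from each row, padding short rows."""
    for row in rows:
        yield [row[i] if i < len(row) else default for i in indexes]

reader = csv.reader(io.StringIO(SAMPLE))
selected = list(select_columns(reader, [0, 3]))  # first and fourth columns

for row in selected:
    print(row)
```

Swap `io.StringIO(SAMPLE)` for an open file handle and the rest stays the same.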

Now let's say you want to label each column as you print it...


import csv

with open("file.csv", newline="") as f:
    data = csv.reader(f)
    for line in data:
        # {0}..{3} are filled in order by the four columns passed to format()
        print("column1: {0}, column2: {1}, column3: {2}, column4: {3}".format(line[0], line[1], line[2], line[3]))
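If your file has a header row, the standard library can do the labelling for you with `csv.DictReader`, which maps each value to its column name. This is a rough Python 3 sketch that assumes the first line of the file holds the headers. The `SAMPLE` data and the `labelled` list are made up for illustration.

```python
import csv
import io

# Hypothetical sample data; the first line is assumed to be the header row
SAMPLE = (
    "url,title,status,anchor\n"
    "http://a.example,Home,200,brand\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))

# Build one "name: value" string per row, in header order
labelled = [
    ", ".join("{0}: {1}".format(name, value) for name, value in row.items())
    for row in reader
]

for line in labelled:
    print(line)
```

The trade-off is that this only works when the header row is there. For headerless files, the numbered version is still the way to go.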

Or maybe you want to search for something within a column to check it exists?


import csv

with open("file.csv", newline="") as f:
    data = csv.reader(f)
    for line in data:
        if 'something' in line[2]:  # substring match against the third column
            print('found', line[2])
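If all you need is a yes or no answer, you can stop reading at the first match. A small Python 3 sketch of that idea follows. `column_contains` is a hypothetical helper of mine and `SAMPLE` is invented data standing in for file.csv.

```python
import csv
import io

# Hypothetical sample data standing in for file.csv
SAMPLE = (
    "url,title,status\n"
    "http://a.example,Home,200\n"
    "http://b.example,About something,404\n"
)

def column_contains(rows, index, needle):
    """Return True as soon as any row has needle in the given column."""
    return any(len(row) > index and needle in row[index] for row in rows)

# A reader can only be consumed once, so each check gets a fresh one
found = column_contains(csv.reader(io.StringIO(SAMPLE)), 1, "something")
missing = column_contains(csv.reader(io.StringIO(SAMPLE)), 1, "nothing")

print(found, missing)
```

Because `any()` short-circuits, the rest of the file is never read once a match turns up, which helps on big exports.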

Here is a more job-specific example. Say you wanted to count the anchor text frequency of a backlink profile...


import csv
import collections

lis = []

with open("file.csv", newline="") as f:
    data = csv.reader(f)
    for line in data:
        lis.append(line[3])  # the fourth column holds the anchor text

# build the counter once, after every row has been read
counter = collections.Counter(lis)

for word, value in counter.items():
    print(word, value)
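On a real backlink profile you usually only care about the most frequent anchors. `Counter.most_common` returns them sorted by count. Here is a Python 3 sketch that assumes the anchor text sits in the fourth column and that the file starts with a header row to skip. The `SAMPLE` data is invented for illustration.

```python
import collections
import csv
import io

# Hypothetical sample data standing in for a backlink export
SAMPLE = (
    "url,title,status,anchor\n"
    "a,Home,200,brand\n"
    "b,About,200,click here\n"
    "c,Blog,200,brand\n"
    "d,Shop,200,brand\n"
    "e,FAQ,200,click here\n"
    "f,Contact,200,homepage\n"
)

reader = csv.reader(io.StringIO(SAMPLE))
next(reader, None)  # skip the header row (assumed to exist)

# Count the fourth column directly, ignoring rows that are too short
counter = collections.Counter(row[3] for row in reader if len(row) > 3)
top_two = counter.most_common(2)

for word, value in top_two:
    print(word, value)
```

Feeding the counter from a generator also saves building the intermediate list.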

