 2 years ago
source link: https://www.codesd.com/item/in-python-how-can-i-keep-track-of-my-position-in-a-csv-file-in-the-case-of-an-exception-during-a-for-loop-that-reads-each-line.html

In Python, how can I keep track of my position in a .csv file, in the case of an exception, during a for loop that reads each line?

I am making a few thousand web requests, but I can't complete them all at once. How can I save my position, possibly using try and except, when I hit a urllib2.HTTPError: HTTP Error 400: Bad Request? When the error occurs, I would like to call time.sleep for five minutes or so and then continue where I left off, but I am not sure how to do that. I considered making a list and popping elements from it, but that seems cumbersome. Any suggestions?

Here's the meat of my code:

with open('csvlist.csv', 'rb') as data:
    reader = csv.reader(data)
    for row in reader:
        retrieveAdd(row[0])

Here, retrieveAdd makes the web request and adds some data to a database. I tried sleeping after every 100 requests (below), but it didn't work.

with open('csvlist.csv', 'rb') as data:
    reader = csv.reader(data)
    count = 0
    for row in reader:
        retrieveAdd(row[0])
        count += 1
        if count % 100 == 0:
            time.sleep(180)


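Why not wrap the "retry after 5 minutes" operation in a function? The retry then happens inside the loop body, so the reader stays on the current row while the request is retried and you never have to track your position in the file yourself.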

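import csv
import time
import urllib2

def retrieveAddSafe(data, repeat=5):
    """Attempt to retrieve `data`, making up to `repeat` attempts.

    HTTPErrors from the first `repeat - 1` attempts are swallowed, with a
    five-minute sleep after each; an error on the final attempt is raised."""
    for _ in xrange(repeat - 1):
        try:
            return retrieveAdd(data)
        except urllib2.HTTPError:
            # wait five minutes, then retry the same item
            time.sleep(5 * 60)

    # last attempt: if this one fails too, allow the error to be raised
    return retrieveAdd(data)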

with open('csvlist.csv', 'rb') as data:
    reader = csv.reader(data)
    for row in reader:
        retrieveAddSafe(row[0])
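
If you are on Python 3, the same idea might look roughly like the sketch below. This is only an illustration under a few assumptions: urllib2.HTTPError is replaced by urllib.error.HTTPError, your own retrieveAdd is passed in as the `retrieve_add` argument, and the names `retrieve_add_safe`, `process_csv`, `delay` and `sleep` are made up for this example. The `sleep` parameter exists only so the waiting can be swapped out when trying the code.

```python
import csv
import time
import urllib.error


def retrieve_add_safe(item, retrieve_add, repeat=5, delay=5 * 60, sleep=time.sleep):
    """Call retrieve_add(item), retrying when it raises HTTPError.

    At most `repeat` attempts are made. `retrieve_add` stands in for the
    asker's own retrieveAdd function (an assumption of this sketch).
    """
    for attempt in range(1, repeat + 1):
        try:
            return retrieve_add(item)
        except urllib.error.HTTPError:
            if attempt == repeat:
                raise  # out of attempts: let the caller see the error
            sleep(delay)  # wait, then retry the same item


def process_csv(path, retrieve_add, **retry_kwargs):
    """Run retrieve_add_safe on the first column of every non-empty row."""
    handled = 0
    # newline="" is what the csv module expects for files opened in Python 3
    with open(path, newline="") as data:
        for row in csv.reader(data):
            if not row:
                continue  # skip blank lines
            retrieve_add_safe(row[0], retrieve_add, **retry_kwargs)
            handled += 1
    return handled
```

Calling process_csv('csvlist.csv', retrieveAdd) would then behave like the loop above: a row that fails is retried in place, and the loop only moves on to the next row once the call has returned.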

