{

I had posted earlier about Python's set type with some questions. All that changed today when I wrote a little script to remove duplicate lines from a file. Calling set() on a list (or any other iterable) automatically gets rid of duplicate items, although it does not keep the items in their original order. Very useful for situations like this:


#!/usr/bin/env python

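# Read all lines and drop duplicates; a set is unordered, so line order is not kept.
# splitlines() avoids the stray empty string that split("\n") leaves after a final newline.
with open("c:\\temp\\Original.txt") as f:
    uniquelines = set(f.read().splitlines())

# Write each unique line back out; "with" closes both files automatically.
with open("c:\\temp\\Unique.txt", "w") as f2:
    f2.write("".join([line + "\n" for line in uniquelines]))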
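
One caveat with the script above: a set has no defined order, so the lines in Unique.txt can come out in a different sequence than in the original file. If the order matters, here is a minimal sketch of one way around it: keep the first occurrence of each line and use the set only to remember what has already been seen. The names unique_lines and dedupe_file are my own for illustration and are not part of the original script.

```python
def unique_lines(lines):
    """Yield each line the first time it appears, preserving input order."""
    seen = set()  # lines already emitted
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def dedupe_file(src_path, dst_path):
    """Copy src_path to dst_path, skipping lines that already appeared.

    Hypothetical helper: reads one line at a time, so the whole file
    never has to be held in memory as a single string.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        # Strip the newline so a final line without one still matches
        # an earlier identical line.
        stripped = (line.rstrip("\n") for line in src)
        for line in unique_lines(stripped):
            dst.write(line + "\n")
```

It would be called as dedupe_file("c:\\temp\\Original.txt", "c:\\temp\\Unique.txt"). The membership check against a set stays fast even for large files, so this should not be noticeably slower than the one-liner.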


}

Comments

Ramen
But the problem is that the set operator automatically sorts the lines after removing the duplicates... What if we don't want to change the order?
Covert Assassin
Hi. I just used this one to eliminate duplicates in my file. I would like to know more about this set function. I'm seriously left wondering how so few LOC did that perfectly! Any thoughts?
G.T. Rajpurohit
It is great,
but it alters the sequence of the lines in the file.
Admin
Superb!

I had a database in FileMaker with 250,000 records and had written a script to delete duplicates. I was looking at around 24 hours to run that in FileMaker.

I exported it as a CSV and ran your code in Python. It took less than 3 seconds and spat out a CSV that I imported back into FileMaker.

Thank you!