<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: CSV Parsing: Haskell versus Python</title>
	<atom:link href="http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/feed/" rel="self" type="application/rss+xml" />
	<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/</link>
	<description>the notebook of a computer scientist living in midtown manhattan</description>
	<lastBuildDate>Sun, 11 Dec 2011 20:21:06 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Josh</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-97</link>
		<dc:creator><![CDATA[Josh]]></dc:creator>
		<pubDate>Tue, 25 Nov 2008 19:42:16 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-97</guid>
		<description><![CDATA[Did I really forget to carry the zeroes?  D&#039;oh!

Thanks for the correction...]]></description>
		<content:encoded><![CDATA[<p>Did I really forget to carry the zeroes?  D&#8217;oh!</p>
<p>Thanks for the correction&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-96</link>
		<dc:creator><![CDATA[Greg]]></dc:creator>
		<pubDate>Wed, 19 Nov 2008 00:37:55 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-96</guid>
		<description><![CDATA[Josh,

Well, the file had 157562 lines and is 9779275 bytes in size.  So yeah, that&#039;s about 62 bytes per line.  But processing that number of lines in 1.7 seconds is only 5.5MBps.  So, by that math, it&#039;s megas, not gigas.

Greg]]></description>
		<content:encoded><![CDATA[<p>Josh,</p>
<p>Well, the file had 157562 lines and is 9779275 bytes in size.  So yeah, that&#8217;s about 62 bytes per line.  But processing that number of lines in 1.7 seconds is only 5.5MBps.  So, by that math, it&#8217;s megas, not gigas.</p>
<p>Greg</p>
]]></content:encoded>
	</item>
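<!--
The arithmetic in Greg's reply above can be checked in a few lines. This is only a sketch: the byte count, line count, and 1.7 second timing are taken from the comment, while the variable names and the choice to report both decimal megabytes and binary mebibytes are assumptions made for illustration.

```python
# Figures quoted in the comment above.
file_bytes = 9_779_275
line_count = 157_562
elapsed_s = 1.7

bytes_per_line = file_bytes / line_count
bytes_per_second = file_bytes / elapsed_s

# Throughput in decimal megabytes (10**6 bytes) and binary mebibytes (2**20 bytes).
mb_per_second = bytes_per_second / 10**6
mib_per_second = bytes_per_second / 2**20

print(f"{bytes_per_line:.1f} bytes per line")
print(f"{mb_per_second:.2f} MB/s, {mib_per_second:.2f} MiB/s")
```

In either unit the rate is a few megabytes per second, far below 5 GB/s.
-->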
	<item>
		<title>By: Josh</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-95</link>
		<dc:creator><![CDATA[Josh]]></dc:creator>
		<pubDate>Wed, 19 Nov 2008 00:22:53 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-95</guid>
		<description><![CDATA[160,000 records, at well over 50 bytes apiece, in 1.7 seconds?

That&#039;s almost 5GB/sec.

Either I&#039;m misunderstanding something, or the MacBook&#039;s hard drive is apparently over 5 times faster than $50,000.00 RAID arrays currently on the market.  That&#039;s amazing!]]></description>
		<content:encoded><![CDATA[<p>160,000 records, at well over 50 bytes apiece, in 1.7 seconds?</p>
<p>That&#8217;s almost 5GB/sec.</p>
<p>Either I&#8217;m misunderstanding something, or the MacBook&#8217;s hard drive is apparently over 5 times faster than $50,000.00 RAID arrays currently on the market.  That&#8217;s amazing!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kelila</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-93</link>
		<dc:creator><![CDATA[Kelila]]></dc:creator>
		<pubDate>Tue, 28 Oct 2008 17:46:33 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-93</guid>
		<description><![CDATA[Good words.]]></description>
		<content:encoded><![CDATA[<p>Good words.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Christine Fenton</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-77</link>
		<dc:creator><![CDATA[Christine Fenton]]></dc:creator>
		<pubDate>Fri, 25 Jul 2008 16:22:10 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-77</guid>
		<description><![CDATA[Never, *ever*, **ever** unpack ByteStrings to Strings. That&#039;s the whole point of ByteStrings! If you unpack, you may as well just use `[Char]`. Stick to ByteStrings only and you get [this kind of result](http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&amp;lang=all).]]></description>
		<content:encoded><![CDATA[<p>Never, *ever*, **ever** unpack ByteStrings to Strings. That&#8217;s the whole point of ByteStrings! If you unpack, you may as well just use `[Char]`. Stick to ByteStrings only and you get [this kind of result](http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&amp;lang=all).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Don Stewart</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-75</link>
		<dc:creator><![CDATA[Don Stewart]]></dc:creator>
		<pubDate>Sun, 20 Jul 2008 00:33:30 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-75</guid>
		<description><![CDATA[Note there&#039;s a ByteString-based CSV parser on hackage now,

   http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring-csv

which is significantly faster than Text.CSV. That, along with the bytestring-lex package for parsing floating-point values from ByteStrings, should mean your program can be both concise and fast.]]></description>
		<content:encoded><![CDATA[<p>Note there&#8217;s a ByteString-based CSV parser on hackage now,</p>
<p>   <a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring-csv" rel="nofollow">http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring-csv</a></p>
<p>which is significantly faster than Text.CSV. That, along with the bytestring-lex package for parsing floating-point values from ByteStrings, should mean your program can be both concise and fast.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stan Domeshok</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-47</link>
		<dc:creator><![CDATA[Stan Domeshok]]></dc:creator>
		<pubDate>Tue, 15 Jul 2008 17:16:54 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-47</guid>
		<description><![CDATA[Real World Haskell includes a full CSV parser written with Parsec. The whole thing ended up being about 20 lines of code, so you might want to take a look at it.
http://book.realworldhaskell.org/beta/parsec.html]]></description>
		<content:encoded><![CDATA[<p>Real World Haskell includes a full CSV parser written with Parsec. The whole thing ended up being about 20 lines of code, so you might want to take a look at it.<br />
<a href="http://book.realworldhaskell.org/beta/parsec.html" rel="nofollow">http://book.realworldhaskell.org/beta/parsec.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-46</link>
		<dc:creator><![CDATA[Greg]]></dc:creator>
		<pubDate>Tue, 15 Jul 2008 16:32:13 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-46</guid>
		<description><![CDATA[@author: Ok, as a temporary workaround, you may add timestamps before each &quot;class&quot; of action in order to find what is taking too much time. I think it&#039;s important to find and focus on the code that slows your program down the most.]]></description>
		<content:encoded><![CDATA[<p>@author: Ok, as a temporary workaround, you may add timestamps before each &#8220;class&#8221; of action in order to find what is taking too much time. I think it&#8217;s important to find and focus on the code that slows your program down the most.</p>
]]></content:encoded>
	</item>
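<!--
The timestamp approach suggested above could look something like the following. It is a minimal sketch, not the post author's code: the timed helper, the timings dictionary, and the stand-in workload (generating, splitting, and summing CSV-like lines) are all hypothetical names and data chosen for illustration.

```python
import time
from contextlib import contextmanager

timings = {}  # maps each phase name to its elapsed seconds


@contextmanager
def timed(phase):
    """Record how long the enclosed block takes under the given phase name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + time.perf_counter() - start


# Stand-in workload: build some CSV-like lines, split them, then sum a column.
with timed("generate"):
    lines = [f"TICK{i},{i * 0.5}" for i in range(10_000)]
with timed("parse"):
    rows = [line.split(",") for line in lines]
with timed("convert"):
    total = sum(float(row[1]) for row in rows)

# Report phases from slowest to fastest.
for phase, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{phase:10s} {seconds:.6f} s")
```

Sorting the report by elapsed time puts the phase worth optimizing at the top.
-->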
	<item>
		<title>By: Reid</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-45</link>
		<dc:creator><![CDATA[Reid]]></dc:creator>
		<pubDate>Tue, 15 Jul 2008 15:28:02 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-45</guid>
		<description><![CDATA[You realize that Python has a csv module as part of the standard library.  All you really need is (I think roughly) this:

import csv

with open(&#039;myfile.csv&#039;, newline=&#039;&#039;) as f:
    for row in csv.reader(f):
        pass  # do stuff with row

To make it even better, you can go so far as to use a DictReader, which lets you use the row like a dictionary with real string names.  And then if you wanted to go further, you could wrap that in a class that lets you use regular attribute lookup.  But if you want it to go really fast, you name your indices like so:

_ticker = 0
_bid = 1
...

And your code would still be readable.]]></description>
		<content:encoded><![CDATA[<p>You realize that Python has a csv module as part of the standard library.  All you really need is (I think roughly) this:</p>
<p>import csv</p>
<p>with open(&#039;myfile.csv&#039;, newline=&#039;&#039;) as f:<br />
    for row in csv.reader(f):<br />
        pass  # do stuff with row</p>
<p>To make it even better, you can go so far as to use a DictReader, which lets you use the row like a dictionary with real string names.  And then if you wanted to go further, you could wrap that in a class that lets you use regular attribute lookup.  But if you want it to go really fast, you name your indices like so:</p>
<p>_ticker = 0<br />
_bid = 1<br />
&#8230;</p>
<p>And your code would still be readable.</p>
]]></content:encoded>
	</item>
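<!--
The two styles Reid describes above, named indices with a plain reader and string keys with a DictReader, can be compared side by side. This is a sketch under assumptions: the in-memory sample data and its ticker and bid columns are hypothetical (only the names _ticker and _bid come from the comment), and io.StringIO stands in for a real file so the example is self-contained.

```python
import csv
import io

# Hypothetical sample data; a real program would open a file instead.
sample = "ticker,bid\nIBM,125.50\nMSFT,27.25\n"

# Plain reader: each row is a list of strings, indexed by named constants.
_ticker = 0
_bid = 1
reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
by_index = [(row[_ticker], float(row[_bid])) for row in reader]

# DictReader: the header row supplies the keys for each row.
by_name = [
    (row["ticker"], float(row["bid"]))
    for row in csv.DictReader(io.StringIO(sample))
]

print(by_index)
print(by_name)
```

Both approaches yield the same records; the plain reader avoids building a dictionary per row, which is the speed trade-off the comment alludes to.
-->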
	<item>
		<title>By: Erhan Hosca</title>
		<link>http://techguyinmidtown.com/2008/07/14/csv-parsing-haskell-versus-python/#comment-43</link>
		<dc:creator><![CDATA[Erhan Hosca]]></dc:creator>
		<pubDate>Tue, 15 Jul 2008 12:08:00 +0000</pubDate>
		<guid isPermaLink="false">http://techguyinmidtown.wordpress.com/?p=30#comment-43</guid>
		<description><![CDATA[i don&#039;t think Haskell is designed for parsing CSV files ... it&#039;s not a language with emphasis on I/O ...

i&#039;m not sure why you&#039;re using it for that purpose ... in the financial space even :)]]></description>
		<content:encoded><![CDATA[<p>i don&#8217;t think Haskell is designed for parsing CSV files &#8230; it&#8217;s not a language with emphasis on I/O &#8230;</p>
<p>i&#8217;m not sure why you&#8217;re using it for that purpose &#8230; in the financial space even :)</p>
]]></content:encoded>
	</item>
</channel>
</rss>
