I never thought much about character entity references. For XML I assumed at least the five predefined ones (&amp;, &lt;, &gt;, &quot; and &apos;) for the characters &, <, >, " and '. Naively I assumed the same for HTML (plus the usual other ones that have been in use for years, like the entities for German umlauts etc).
For more details see e.g. http://lachy.id.au/log/2005/10/page/2/.
Not easy to write this post in WordPress BTW, the HTML editor keeps changing the entities or does weird things with them…
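To remind myself what the five predefined XML entities look like in practice, here is a minimal sketch using only the Python standard library. The sample string and the name XML_QUOTES are made up for illustration. Note that html.escape writes the single quote as a numeric reference rather than as &apos;, which fits the point that XML and HTML should not be assumed to share the same named entities.

```python
import html
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quote characters
# have to be passed in explicitly (XML_QUOTES is a name chosen here).
XML_QUOTES = {'"': "&quot;", "'": "&apos;"}

sample = "Tom & Jerry <say> \"hi\" and 'bye'"

# All five predefined XML entities.
as_xml = escape(sample, XML_QUOTES)

# The HTML helper: same result for &, <, > and ", but the single
# quote becomes the numeric reference &#x27; instead of &apos;.
as_html = html.escape(sample, quote=True)

print(as_xml)
print(as_html)
```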
There is an interesting article at O’Reilly, “Understanding Newlines”, which is written for Perl but holds true for Python, too. I knew about possible problems but have not run into any with newlines yet. I did not know the actual details, though, and now understand why Subversion complains about a mix of newline characters in files I have checked in (probably because coworkers and I use different editors, with different option settings even). Now I also understand what opening a file in text mode actually does. This may seem ignorant, but until now I almost always worked with text files anyway, which do not cause problems with the default text mode of Python and other languages…
Anyway, an interesting article, also because it has some information about Unicode newline characters.
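A small sketch of what text mode does to newlines, assuming Python 3 (the file content and variable names are made up for illustration). The same bytes, containing a mix of \r\n, \n and \r, are read once in binary mode, once in default text mode and once in text mode with newline translation switched off.

```python
import os
import tempfile

# A file with mixed line endings, as different editors might produce.
raw = b"one\r\ntwo\nthree\rfour"

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw)

# Binary mode: the bytes come back exactly as written.
with open(path, "rb") as f:
    binary = f.read()

# Default text mode: \r\n and \r are both translated to \n.
with open(path, "r", encoding="ascii") as f:
    text = f.read()

# Text mode with newline="": decoding happens, translation does not.
with open(path, "r", encoding="ascii", newline="") as f:
    untranslated = f.read()

os.remove(path)

# str.splitlines() also knows Unicode line boundaries such as
# U+2028 (LINE SEPARATOR) and U+0085 (NEXT LINE).
unicode_lines = "a\u2028b\u0085c".splitlines()

print(repr(binary), repr(text), repr(untranslated), unicode_lines)
```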
I guess there are quite a few tools to process Excel files with Python, but I found (and used) the following, which are easy to use and were sufficient at least for my humble needs…
To just read an Excel spreadsheet, a recipe in the Python Cookbook is very useful.
To actually produce sheets, pyExcelerator at SourceForge is very nice. Helpful usage instructions may be found here.
Another even simpler possibility (no library needed) is to produce a simple HTML file with just an HTML table in it and save it as *.xls. Anything like simple “print” statements (taking care of proper escaping of < and & for example) or (preferably) any XML producing library should do fine. Just open the file in Excel and even stuff like coloring (with td/@bgcolor), aligning (td/@align) or rowspan/colspan just works. Unfortunately this was not my idea but that of a colleague, but it is very easy to use and very powerful too.
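A minimal sketch of the HTML-table-as-*.xls idea, assuming Python 3 and only the standard library. The function name html_table, the sample rows, the colour value and the file name report.xls are all made up for illustration. Cell values are escaped with html.escape, the header row gets a td/@bgcolor and numeric cells get td/@align.

```python
import html
import os
import tempfile


def html_table(rows, header_bgcolor="#cccccc"):
    """Return a small HTML document holding one table built from rows.

    The first row is treated as the header; every value is escaped.
    """
    lines = ["<html><body><table>"]
    for index, row in enumerate(rows):
        cells = []
        for value in row:
            attrs = ""
            if index == 0:
                # Colour the header cells.
                attrs = ' bgcolor="%s"' % header_bgcolor
            elif isinstance(value, (int, float)):
                # Right-align numbers.
                attrs = ' align="right"'
            cells.append("<td%s>%s</td>" % (attrs, html.escape(str(value))))
        lines.append("<tr>%s</tr>" % "".join(cells))
    lines.append("</table></body></html>")
    return "\n".join(lines)


rows = [
    ["Item", "Price"],
    ["Nuts & bolts", 3.5],
    ["Widget <large>", 12],
]
document = html_table(rows)

# Save the HTML under an .xls name so it can be opened with Excel.
path = os.path.join(tempfile.gettempdir(), "report.xls")
with open(path, "w", encoding="utf-8") as f:
    f.write(document)
```

The escaping is the part that is easy to forget with plain “print” statements: without it, a value like "Widget <large>" would break the table markup.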
Another of those entries to remind myself