I know that many of my readers have tackled large writing projects (books, dissertations, etc.), so I’m hoping you can help me out. As I begin my thesis, I can’t help but look at it as somewhat of a software development project. While the creative process is different, the machinations are similar: I’ll be adding or editing lines (of text) and referencing objects (citations), and I’ll be stylizing and formatting the material. It’s a development process, with different content.
Here’s the problem I face. With software development, I use version control and project management software (Trac), I’ve got multiple backups on different machines, I’ve got builds, tests, etc. With writing, I’ve got a binary file that just keeps getting bigger each day. I don’t have an automatic way of seeing multiple versions, I don’t have software that lets me see the changes between checkins, so on and so forth. This is what I’d generally think of as lacking a “sane” development environment, and it is worrying.
A quick note about my writing flow. For a number of reasons, I do most of my writing in Word. My final document is often processed in LaTeX, but the actual writing and saving parts are done in Word. The problem with this is the writing process is a black box; I can’t see what changes I’ve made every day, I don’t have sane merges, etc. I suppose I could have a semblance of this functionality if I just saved a new copy of the document each day, but the idea of searching through 50 copies of a document to figure out what day I added that part and what I was thinking is just nuts, especially when I compare it to the heads-up display I get with Trac.
I’m willing to adopt a new word processor, and I’m willing to write a bunch of scripts that will manage a build process. On the other hand, I’m not willing to write my dissertation in TextEdit (which I suppose is the only real answer if I want to follow a development method, sigh). But since I am new at this, I figure there are some tools or tricks that I’m missing. What I’m looking for is sane versioning, with SVN integration a major plus (this would enable me to Trac my project), and maybe some advice on methods or tools that have worked for you. My hope is that this post will help others who stumble upon it, so please consider leaving a comment about what worked for you, etc. Thank you!
Tags: academic, dissertation, writing
Fred Stutzman is a doctoral student, researcher and teaching fellow at the University of North Carolina at Chapel Hill's School of Information and Library Science. He studies how people use social media.





TeXShop
http://www.uoregon.edu/~koch/texshop/
You get the text you crave (for versioning), the layout you eventually want at the end, and the integration with the rest of your workflow since it’s Mac (and open source to boot).
Clearly the answer – also, plays well with BibTeX (and BibDesk).
TeXShop is very good; it is actually what I use for rendering. Let me clarify – I want the benefits of the word processor. I need to think visually as I write and the word processor lets me do that. The idea of writing LaTeX markup as I go just doesn’t appeal (yes, TeXShop has lots of macros, but I usually end up writing my own). I also need to be able to send my docs around so others can mark them up, and using a standard word processor will make this process simpler (yeah, people can mark up a PDF but…ehh).
Fred,
Karen Coombs and I just finished the first draft of our upcoming book, and did 99% of the thing using Google Docs. It gave us collaboration, versioning, tracking…everything we needed. We then built the final edits in Word to ensure formatting for the publisher, but almost all of what we needed came with Google.
Jason, I hadn’t thought of Google Docs, but that is an option. Does it work with any citation management software?
I think Word itself may have some version tracking capabilities. Have you poked around the Track Changes section?
Yes – but I find that Track Changes is optimized for multi-author documents, as opposed to an evolving document. Plus, once I approve the change the text just merges into the document, without leaving any record as far as I can tell.
If you’re interested in other processes or workplans for writing, check out the Gilbert Center’s writing course. My partner is now tackling a huge book project based on this approach, and it’s made a big difference.
It’s quite simple, in essence: define the project as writing a certain number of words, e.g. 50,000, and then determine how many words you need to write per day to meet your deadlines. Editing and other activities can, in principle, be converted into a word count equivalent measure as well.
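To make the arithmetic concrete, here is a minimal sketch of that daily-quota calculation. The function name, the dates, and the words-already-written figure are made up for illustration; only the 50,000-word target comes from the comment above.

```python
import math
from datetime import date


def words_per_day(total_words, words_done, today, deadline):
    """Return the daily word quota needed to hit total_words by the deadline."""
    days_left = (deadline - today).days
    if days_left <= 0:
        raise ValueError("deadline must be after today")
    remaining = max(total_words - words_done, 0)
    # Round up so the quota never leaves a shortfall on the last day.
    return math.ceil(remaining / days_left)


# Hypothetical numbers: 50,000-word project, 12,500 words written, 100 days left.
quota = words_per_day(50_000, 12_500, date(2008, 1, 1), date(2008, 4, 10))
print(quota)
```

Editing days could be folded in by counting, say, one edited page as some agreed number of "words", which is the word-count-equivalent idea described above.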
I haven’t tried it out yet, but Scrivener looks quite interesting.
I wrote my entire thesis using Subversion/Trac for precisely the reasons you state. The process of writing academic work for me is such that bibliographic tools dominate my use cases. Having my citations in order greatly exceeds any other needs (sharing, WYSIWYG, etc.) but I might be a neat freak in that regard.
Endnote never really cut it for me. BibDesk is in a class of its own.
>> Does it [GoogleDocs] work with any citation management software?
If you use the Zotero plug-in in Firefox as your citation manager, you can drag and drop citations into your Google documents. I’ve played with Zotero and Google Docs, but never the two together, myself, but it might work for you.
Hi Fred,
A friend of mine doing his thesis on wikis said he would use MediaWiki to write it. Guess he didn’t start yet, so I do not know whether it worked fine for him – I think it wouldn’t for me.
Concerning versions, I do save different files each and every time some “differential bunch of content” is added to my thesis. So, I’m not tracking all changes, but the most important I do. When I send files to my supervisor or friends, I save new versions (name_of_version_delivery_xxx) plus a changelog (name_of_version_changelog) where I highlight the main changes. I also keep a file for every commented version I get (name_of_version_delivery_xxx_commented).
Yep, I know this is not “exactly” what you were asking for. Just wanted to share how I more or less solved your same problem.
I use Papers for Bib management (best software ever), and TeXShop — but I never noticed any sophisticated versioning on that.
Word has an amazing version control & correction system, and I always thought of improving simple TeX conversion macros (Style Title -> \section{^&}) — although 90% of the fun has to be done by hand most of the time.
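For anyone curious what such a style-to-TeX conversion might look like outside of Word macros, here is a rough sketch. It assumes the paragraphs have already been pulled out of Word as (style name, text) pairs; the style names in the mapping and the function names are illustrative guesses, not anything Word or TeXShop provides.

```python
# Hypothetical mapping from Word paragraph styles to LaTeX sectioning commands.
STYLE_TO_COMMAND = {
    "Heading 1": "section",
    "Heading 2": "subsection",
    "Heading 3": "subsubsection",
}

# A few characters that LaTeX treats specially (backslash, ~ and ^ are not handled here).
LATEX_SPECIALS = {
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
}


def escape_latex(text):
    """Escape the special characters listed in LATEX_SPECIALS."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_latex(paragraphs):
    """Convert (style, text) pairs into LaTeX source, one block per paragraph."""
    blocks = []
    for style, text in paragraphs:
        body = escape_latex(text)
        command = STYLE_TO_COMMAND.get(style)
        # Styled paragraphs become \command{...}; everything else is plain text.
        blocks.append(f"\\{command}{{{body}}}" if command else body)
    return "\n\n".join(blocks)


print(to_latex([("Heading 1", "Methods & Data"), ("Normal", "Body text.")]))
```

This only covers headings; tables, figures, and citations are the part that would still have to be done by hand.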
I don’t think writing LaTeX is the drawback you think it is. I’m a very visual person, and what I find I have to do for planning and major reorganisation (regardless of what I’m using) is printing the whole thing out. There’s just no comparison to being able to see large chunks of your work at once (6 pages at a time is quite feasible if you’ve got a decent sized desk) and being able to see how to move big chunks of the paper or book to improve flow. Writing on screen is fine for micro changes, and writing when you know what you want, but for anything major I still can’t go past paper and pen.
Scrivener is amazing and accomplishes a lot of what you just described. It is a great writing environment.
Thank you to everyone for the ideas – I am learning a lot. Keep the comments coming!
Sorry if this is a little bit offtopic/fun/frivolous, but Hadley’s comment reminds me of a time when I had to cut and paste (real cut and paste, I mean) some of my citations so I could put them in a coherent order.
I couldn’t succeed doing it on the screen…
You can see the results here:
http://flickr.com/photos/10176016@N03/2046784747/
Sorry Fred ;)
I use the “Google Platform” to help me build and evolve my central “content platform”, either for one project or across all of them.
I too am a software developer, so I approach this in a very similar way.
Everything starts in Google Notebook, because I can harvest from a web page or blog with just a right click and it keeps the reference to the original piece. I can also just capture my own thoughts as I am surfing using my Firefox add-in.
I can then organize these “notes” into notebooks by relevancy. I can further refine and handle the evolution by scripts using the Google Notebook API.
Then as information and thoughts mature I migrate to Google Docs with a simple right click in Google Notebook.
Once in Google Docs I can refine more formally and with the assistance of others if necessary through collaboration.
Then I can also organize data in spreadsheets and evolve aspects into presentations.
From there I can publish to blogger, PDF, a book or whatever publishing format I wish.
I like the Google Platform because of how the applications complement each other and work together; however, the API is the biggest piece, because it allows me to work with things programmatically using scripts.
I’d thought it was too simple for your needs, but since somebody mentioned Google Docs, I thought I’d throw WriteWith into the mix.
I use it for all my online writing now. In fact, all my writing. But, I’m not writing a dissertation and our use cases are probably very different. Nonetheless, it’s a very sweet tool for getting lots of text onto a page really easily and being able to revert versions simply.
Good luck!
My friend and SDP colleague Ismael asked me if I’ve made a decision regarding workflow…here’s what I wrote to him:
The answer is yes and no. In terms of conclusions, I’ve found that there probably isn’t a single system that supports what I want, i.e. no great integrated system. Word is likely quite robust (esp newer versions, I’m still on Word X), but getting my stuff from Word into an SVN-like flow would require hand-copying and pasting or something like that. Which may end up being the solution.
I’ve also found that citation manager integration seems to trump all. I use BibDesk, which integrates perfectly with TeXShop, but does not integrate with Word. This is likely the largest decision factor…It’s hard enough to manage 20-40 references for a manuscript, but no way am I going to try to juggle the 200-300 I’ll use on my thesis.
So what I think will happen is that I’ll do basic drafting in Word, but the majority of my work will be done in TeXShop, which will integrate perfectly with SVN and BibDesk. I need to spend a few hours better understanding classes and LaTeX markup, but once I get over that hurdle I imagine it won’t feel much different than coding HTML or something like that.
- Thank you to everyone who has contributed to this thread.
Why not try EndNote? I just heard about it and I’ve heard it’s a great tool to write research papers. I would recommend it. I got to use it, and it’s great for citing sources, etc.
At the risk of stating the obvious: any tool which saves its data in XML or some other human-readable text markup will at least make it possible to manage the documents in SVN.
These include, if I am not mistaken: Open Office, Apple Pages, and… newer versions of Word? And of course LaTeX.
Then the question becomes, which ones write their data in a way that is sensible enough for the diffs to be meaningful.
Further examination might reveal that you can achieve useful diffs using some relatively simple processing scripts. For example, maybe one of the formats has a bunch of proprietary tags for all kinds of formatting and meta information that you don’t care about diffing, but it also has a clear place where it puts the “real content”… so you can parse the xml tree, strip out all the <content> tags, and then diff that content.
And by “strip out” I meant, extract and isolate the content of those tags.
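As a rough illustration of that idea, here is a minimal sketch that pulls the text out of every `<content>` element and diffs only that. The `<content>` tag is the hypothetical one from the comment above, and the sample documents and function names are invented; real formats (Word, OpenOffice, Pages) use their own namespaced tags, so the tag name would need adjusting.

```python
import difflib
import xml.etree.ElementTree as ET


def extract_content(xml_text, tag="content"):
    """Return the text of every <tag> element, ignoring all other markup."""
    root = ET.fromstring(xml_text)
    return ["".join(node.itertext()).strip() for node in root.iter(tag)]


def content_diff(old_xml, new_xml, tag="content"):
    """Unified diff of the extracted content only, one paragraph per line."""
    old = extract_content(old_xml, tag)
    new = extract_content(new_xml, tag)
    return list(difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm=""))


# Two made-up revisions: the font changed (noise) and one paragraph changed (signal).
old_doc = (
    "<doc><style font='Times'/>"
    "<content>First paragraph.</content>"
    "<content>Second paragraph.</content></doc>"
)
new_doc = (
    "<doc><style font='Helvetica'/>"
    "<content>First paragraph.</content>"
    "<content>Second paragraph, revised.</content></doc>"
)

diff = content_diff(old_doc, new_doc)
print("\n".join(diff))
```

The formatting change never shows up in the diff because only the extracted paragraphs are compared. The extracted text could also be written to a plain file and checked into SVN alongside the original document.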
Using a citation manager like the Zotero plug-in in Firefox, you can drag and drop citations into your Google documents. I think it should work for you.