Part 3 (Integrated Revision)

Introduction

If you haven't read the previous parts of this series, it's worth reading those first.

As mentioned previously, there were a few issues with the LLM generation process, so I'd like to iterate and fix some of those.

Integrated Revision Approach

At university, I was taking a writing class as part of my general education requirements, and my awesome teacher inspired me to try something. She had been focusing on the revision process, and I realized that it might be good to automate that part too.

Currently, the LLM is expected to generate the whole story on the first try, with no revision or feedback. If revision were part of the process, then perhaps some of these plot and writing tone errors could be fixed.

With that in mind, I set out to improve the system further with a basic revision loop, similar to what students do in class: they write something, submit it, get feedback, and revise.

Thus, I sought to replicate that with another addition to the system. I'd normally write pseudocode, but the process is now complex enough that a block diagram works better.

Block diagram of new process
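
For readers who prefer code to diagrams, the general shape of a draft, feedback, revise loop can be sketched as below. This is a heavily simplified illustration and not a transcription of the diagram: the `critic` and `writer` callables, the `DONE` stop signal, and the `max_rounds` cap are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical interface: a function that takes a prompt and returns the reply.
LLM = Callable[[str], str]


def revise_chapter(draft: str, outline: str, critic: LLM, writer: LLM,
                   max_rounds: int = 3) -> str:
    """Run a draft -> feedback -> revision loop for a single chapter."""
    chapter = draft
    for _ in range(max_rounds):
        # Ask the critic model for feedback on the current version.
        feedback = critic(
            f"<Outline>\n{outline}\n</Outline>\n"
            f"<Chapter>\n{chapter}\n</Chapter>\n"
            "Critique this chapter. Reply with only DONE if no changes are needed."
        )
        # Assumed stop signal: the critic says DONE when it is satisfied.
        if feedback.strip() == "DONE":
            break
        # Ask the writer model to apply the feedback.
        chapter = writer(
            f"<Chapter>\n{chapter}\n</Chapter>\n"
            f"<Feedback>\n{feedback}\n</Feedback>\n"
            "Revise the chapter to address the feedback."
        )
    return chapter
```

The cap on rounds matters in practice: without it, a critic that always finds something to complain about would keep the loop running forever, which is painful when each call is slow.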

The only trick now was to write good system prompts that would tell the various language models what functions to perform so that the system behaves well.

System Prompts

From what I learned over this iteration, prompts that clearly list criteria for the LLM to check seem to work better. With the help of my English teacher (who generously spent her time helping me with this project), I eventually gravitated towards giving the LLMs more clearly defined goals to follow:

    Prompt = f"""
Please provide a final edit the following chapter based on the following criteria and any previous chapters.
Do not summarize any previous chapters, make your chapter connect seamlessly with previous ones.

For your reference, here is the outline:
<Outline>
{_Outline}
</Outline>

And here is the chapter to tweak and improve:
<Chapter>
{Stage3Chapter}
</Chapter>

As a reminder to keep the following criteria in mind:
    - Pacing: 
      - Are you skipping days at a time? Summarizing events? Don't do that, add scenes to detail them.
      - Is the story rushing over certain plot points and excessively focusing on others?
    - Characters
    - Flow
    - Details: Is the output too flowery?
    - Dialogue
    - Development: Is there a clear development from scene to scene, chapter to chapter?
    - Genre
    - Disruptions/conflict


"""
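
Since this checklist gets tweaked a lot during debugging, one option is to keep the criteria as data and render them into the prompt text. The sketch below is only an illustration of that idea: `CRITERIA` and `render_criteria` are hypothetical names, and the entries are a subset copied from the prompt above.

```python
# Hypothetical checklist: each criterion maps to zero or more guiding questions.
CRITERIA = {
    "Pacing": [
        "Are you skipping days at a time? Summarizing events? "
        "Don't do that, add scenes to detail them.",
        "Is the story rushing over certain plot points and "
        "excessively focusing on others?",
    ],
    "Characters": [],
    "Flow": [],
    "Details": ["Is the output too flowery?"],
}


def render_criteria(criteria: dict[str, list[str]]) -> str:
    """Render the checklist as an indented bullet list for a prompt."""
    lines = []
    for name, questions in criteria.items():
        if not questions:
            # A bare criterion with no guiding question.
            lines.append(f"- {name}")
        elif len(questions) == 1:
            # A single question fits on the same line as the name.
            lines.append(f"- {name}: {questions[0]}")
        else:
            # Several questions become nested bullets.
            lines.append(f"- {name}:")
            lines.extend(f"  - {q}" for q in questions)
    return "\n".join(lines)
```

The rendered string can then be dropped into the f-string in place of the hand-written list, so adding or removing a criterion is a one-line change.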

Much of the debugging here came down to running the system, reading the output, and playing whack-a-mole with the system prompts to fix whatever went wrong. Over time, I started to see common issues and tried to fix those systemically rather than one at a time.

The only tricky part was the slow generation time: it now took over ten hours per generation, so it was starting to get a bit painful.

Just for reference, here's what I was using to run this in my dorm room (I bet my university loved the power bill):

Image of Dell PowerEdge R740 server with 3x NVIDIA Tesla P40 GPUs

I'm really not sure how to improve the generation time, although that isn't my primary focus with this project. I don't really care if it takes a day or two to generate something, as long as it's good. I'm more interested in quality than quantity, but slow generation does make debugging hard.

Okay, now let's move on to what worked with this approach and what didn't.

What worked

Length - it's starting to get closer to the lower bound here, but I think that can be fixed easily if needed. I'm more interested in seeing better output for now.
Plot - there was a dramatic improvement in the plot of the resulting story. It was vastly better (at least in the outline) than before.
Characters - with appropriate prompting, there even seemed to be some character development now, which was great to see.
Grammar - the grammatical structure was still correct, as always.
Sanity - no issues here.
Writing Style - with the right prompting, the writing style was also much better. It wasn't great, but it was okay.

What didn't work

Pacing - sometimes it spent hundreds of words describing the color of the sky, and other times it skipped over entire battle scenes in just a few words.
Word Choice - so many stock phrases, like 'the tension was palpable' or 'as days turned to weeks'. I never want to read those ever again.
Chapter Consistency - this is a new problem. Before, it was basically writing only one chapter, but now that we are doing multiple generations, it tended to forget what happened in previous chapters and write totally different, disjointed things.
Output Formatting - the LLMs would now sometimes add "here's your revised chapter..." or something similar to their output, which would break immersion.
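
To make the output formatting problem concrete, here is a minimal sketch of a heuristic that drops a chatty first line from a model's reply. It is an illustration only, not necessarily the fix used in this series: the `PREAMBLE` pattern and the `strip_preamble` helper are assumptions, and the list of opener words is far from exhaustive.

```python
import re

# Assumed pattern: an optional filler word ("Sure", "Certainly", ...) followed
# by "here's" / "here is" / "here are" at the very start of the line.
PREAMBLE = re.compile(
    r"^\s*(?:sure|certainly|okay|of course)?[,!.\s]*here(?:['’]s| is| are)\b",
    re.IGNORECASE,
)


def strip_preamble(text: str) -> str:
    """Remove a leading line such as "Here's your revised chapter..."."""
    lines = text.strip().splitlines()
    # Only the first non-empty line is checked, to limit false positives.
    if lines and PREAMBLE.match(lines[0]):
        lines = lines[1:]
    return "\n".join(lines).strip()
```

A pattern like this is blunt. A chapter that legitimately opens with "Here is where it began." would lose its first line, so a real pipeline would probably want something more careful than a regular expression.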

Conclusion

Adding the revision process definitely helped a lot with this output. There were of course still many issues, but it was much better than before.

I had sort of run out of ideas at this point, so I decided to pick off the low-hanging fruit first, starting with the output formatting.