Presented by

  • Paco Xander Nathan

    @pacoid
    https://derwen.ai/paco

    Paco Nathan is the Managing Partner at Derwen, Inc. Known as a "player/coach", he has core expertise in graph technologies, natural language, data science, and cloud computing. Paco has ~40 years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Board member for Argilla.io; Advisor for KUNGFU.AI. Lead committer on PyTextRank and kglab. Formerly: Director, Community Evangelism for Apache Spark at Databricks; Director, Learning Group at O'Reilly Media. See: https://derwen.ai/paco

Abstract

There's been an explosion in language models, generative AI, and other machine learning related to natural language. Going "beyond the headlines", this talk shows how to use open source libraries in Python to work with text and image content, from the perspective of an author, editor, or illustrator. We'll look at how advanced and "AI-adjacent" tooling such as language models, data annotation, graph technologies, and interactive visualizations can help creators understand their content better and collaborate more effectively.

For example: What are the themes discussed? Who are the characters? What are the relationships between the characters? Where was concept "XYZZY" first introduced? Did the "Blarg" character actually purchase a quantum transmogrifier before its first use in the story? How do the themes within the content map to the beats in the story arc? What is the "concept density" per chapter, in terms of the pace of new ideas being introduced? How can language models help suggest or refine the prompt engineering used for illustrating a story? Where are illustrations needed?

These approaches apply to the production of fiction, as well as games, movie scripts, plays, documentaries, and various non-fiction. We'll review an example: development of an ebook in the style of Japanese Light Novels (a cli-fi novel, "Latent Space") where artists experimented with collages using components from generative AI, prompts needed to be tracked, image themes needed to be aligned with text themes, and so on. Python provides a wide range of tooling (`spaCy`, `argilla`, `huggingface`, `pyvis`, and so on) as well as data infrastructure tooling to support content work at scale.
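
Two of the questions above, where a concept is first introduced and what the "concept density" per chapter looks like, can be illustrated with a minimal sketch. It is not the talk's actual pipeline: it assumes each chapter is a plain string and uses a regex tokenizer with a tiny stop-word list as a stand-in for a real NLP library such as `spaCy`. The names `STOP_WORDS`, `extract_terms`, and `concept_report`, and the sample chapters, are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real project would
# likely rely on an NLP library for tokenization and filtering instead.
STOP_WORDS = frozenset({
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "it", "is", "was",
})


def extract_terms(text):
    """Return the set of lowercase candidate 'concept' terms in a text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {word for word in words if word not in STOP_WORDS}


def concept_report(chapters):
    """Track first appearances of terms across an ordered list of chapters.

    Returns a pair: a dict mapping each term to the 1-based chapter number
    where it first appears, and a list with the count of newly introduced
    terms per chapter (a crude proxy for "concept density").
    """
    first_seen = {}
    new_per_chapter = []
    for number, text in enumerate(chapters, start=1):
        new_terms = {t for t in extract_terms(text) if t not in first_seen}
        for term in new_terms:
            first_seen[term] = number
        new_per_chapter.append(len(new_terms))
    return first_seen, new_per_chapter


# Made-up sample text, for illustration only.
chapters = [
    "Blarg walked to the market.",
    "Blarg bought a quantum transmogrifier in the market.",
    "The transmogrifier hummed.",
]
first_seen, new_per_chapter = concept_report(chapters)
print(first_seen["transmogrifier"])  # chapter where the term first appears
print(new_per_chapter)               # new terms introduced in each chapter
```

Counting raw words is only a rough approximation of "concepts". Swapping `extract_terms` for entity or noun-phrase extraction would leave the rest of the bookkeeping unchanged.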