yam655.com

       
       
       ============
       Introduction
       ============
       
       Recently, I made a music video for one of my songs. I made the video
       using Python 3 and Asciimatics_.
       
       I'm interested in lo-fi videos, and this was a first experiment.
       While my experiment existed solely within the bounds of a terminal
       window, there were a number of lessons learned through the process.
       
       These lessons will help me, as well as others who are interested in
       performing similar lo-fi experiments.
       
       ==========================
       Use real time (not frames)
       ==========================
       
       Asciimatics uses a single integer "frame" counter.
       
       Regardless of how fast the screen is updating, it's easier to deal with
       simple seconds and fractional seconds. Approximate timing is okay, especially
       if we can readily use timing data gathered from other sources.
       
       For instance, there's timing data in closed captions. It's also easy to create
       timing data in something like Audacity. All of these use either human time or
       seconds and fractional seconds.
       
       Leverage your other tools by dropping the notion of frames. If you
       *really* need to be frame-precise, consider adding a frame separator to
       your timestamps, maybe using an '@' between the seconds and the
       frames.
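
       As a minimal sketch of that idea, the parser below accepts plain seconds,
       ``MM:SS`` or ``HH:MM:SS`` strings, and an optional ``@frames`` suffix. The
       names ``Timestamp``, ``parse_timestamp``, and ``to_seconds`` are made up
       for illustration and are not part of Asciimatics::

           from typing import NamedTuple, Union


           class Timestamp(NamedTuple):
               seconds: float   # offset in seconds
               frames: int = 0  # optional extra frames written after the '@'


           def parse_timestamp(value: Union[str, int, float]) -> Timestamp:
               """Parse 'SS', 'MM:SS', or 'HH:MM:SS', with optional '@frames'."""
               if isinstance(value, (int, float)):
                   return Timestamp(float(value))
               clock, _, frame_part = value.partition('@')
               seconds = 0.0
               for field in clock.split(':'):
                   # Each colon-separated field is worth 60 of the next one.
                   seconds = seconds * 60 + float(field)
               return Timestamp(seconds, int(frame_part) if frame_part else 0)


           def to_seconds(stamp: Timestamp, fps: float) -> float:
               """Fold the frame part into seconds for a given frame rate."""
               return stamp.seconds + stamp.frames / fps

       The frame part only turns into time once a frame rate is known, so the
       rest of the timeline can stay in plain seconds.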
       
       Idealized timeline
       ~~~~~~~~~~~~~~~~~~
       
       A show can be thought of as a series of variable-length scenes, strung together.
       
       In my music video, I had a start screen before the music started.
       
       You need to be flexible with intro and outro content, while also fully supporting
       binding the video to the audio's location. In most cases, you may be able to get
       away with having a "starting time" for a scene which is simply subtracted from all
       action in a scene.
       
       If I have three scenes, A, B, and C, and I know when each one nominally starts,
       it could be something as simple as::
       
           scene_a = Scene(0)
           scene_a.at('0:05', do_it)
           # ... add 10 minutes of content (originally only 5)
           scene_a.at('5:05', part_of_scene_a)
           scene_b = Scene('5:00')
           scene_b.at('5:05', part_of_scene_b)
           # ... add 15 minutes of content
           scene_c = Scene('25:00')
           scene_c.at('25:05', do_it)
           show = [ scene_a, scene_c, scene_b ]
       
       Since each scene knows its own starting point, it can keep the scene timing
       consistent, even if the order of the scenes changes or the earlier
       scenes change length.
       
       Do you want to add a commercial? You shouldn't have to dick with timings.
       Just create the scene and stick it in the show::
       
           commercial = Scene('3:00')
           show = [ scene_a, commercial, scene_c, scene_b ]
       
       That sort of scene movement only works when music is bound to scenes, and not
       the whole show, of course.
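
       One way the offset bookkeeping might work, as a sketch only: ``Scene``
       and ``at`` mirror the example above, while ``to_secs``, ``length``, and
       ``schedule`` are hypothetical helpers. The scene length is assumed to be
       the offset of its last event, which a real implementation would probably
       want to make explicit::

           def to_secs(stamp):
               """Convert 'MM:SS' (or plain seconds) to float seconds."""
               if isinstance(stamp, (int, float)):
                   return float(stamp)
               total = 0.0
               for field in stamp.split(':'):
                   total = total * 60 + float(field)
               return total


           class Scene:
               """A scene that knows the nominal time at which it starts."""

               def __init__(self, start):
                   self.start = to_secs(start)
                   self.events = []  # (seconds from scene start, action)

               def at(self, when, action):
                   # Timestamps are written in show time; store them relative
                   # to the scene so it can be moved without rewriting them.
                   self.events.append((to_secs(when) - self.start, action))

               @property
               def length(self):
                   # Assumption: a scene lasts until its last event.
                   return max((offset for offset, _ in self.events), default=0.0)


           def schedule(show):
               """Return (play time, action) pairs for scenes in list order."""
               timeline, cursor = [], 0.0
               for scene in show:
                   for offset, action in sorted(scene.events, key=lambda e: e[0]):
                       timeline.append((cursor + offset, action))
                   cursor += scene.length
               return timeline

       Because every event is stored relative to its own scene, reordering the
       list passed to ``schedule`` changes the play times without touching any
       timestamp inside a scene.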
       
       Going further
       -------------
       
       For an audio book or other long-form audio stream, you should be able to grab
       the audio file and split it up into separate scenes.

       Markers for splits are, as mentioned, easy enough to gather in Audacity. But
       while you can split things up in Audacity, it shouldn't be required.
       
       If you need to use Audacity anywhere to find the split-points, it's not
       really a time-saver. If, however, you can gather split-points within the
       application, things get more interesting.
       
       Splitting audio at scene-breaks allows you to use scene-breaks as explicit
       restart points when iterating on a scene. It's faster and easier to only
       allow jumping forward and backward at scene changes, as you know the screen
       will start from black.
       
       This means that on the back-end we'll need to track the time of each scene
       change in the audio anyway. If we have that data, we
       should support splicing a new scene into that location.
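
       A rough sketch of how split points could double as restart points. It
       assumes the split points are plain seconds gathered elsewhere;
       ``segments`` and ``restart_point`` are hypothetical names::

           from bisect import bisect_right


           def segments(split_points, duration):
               """Turn split points (seconds) into (start, end) pairs."""
               # Ignore duplicates and anything outside the track itself.
               cuts = sorted(p for p in set(split_points) if 0 < p < duration)
               edges = [0.0] + cuts + [float(duration)]
               return list(zip(edges, edges[1:]))


           def restart_point(split_points, position):
               """Return the start of the scene that contains `position`."""
               cuts = [0.0] + sorted(split_points)
               index = max(bisect_right(cuts, position) - 1, 0)
               return cuts[index]

       Jumping backward during development then means seeking the audio to
       ``restart_point(...)`` and replaying the scene from its blank screen.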
       
       Features and timeline
       ~~~~~~~~~~~~~~~~~~~~~
       
       mvp:
       
           Timeline using real time units. The line between "scene" and
           "show" can be blurry or not exist.
       
       mvp+1:
       
           Shows made of series of stitched-together scenes. Scenes described
           with starting times that may not map to their play-time.
       
       mvp+2:
       
           Scenes and timelines integrate with long audio tracks and
           arbitrary starting points within those tracks.
       
       ==========================
       Synchronize with the audio
       ==========================
       
       My first experiment used PyGame_ to run the song. This back-end is designed for
       background music in games.
       
       You need to be able to query the audio to see where it is. If the audio isn't
       where it is expected to be, you need to hold everything up until it catches up.
       
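       PyGame doesn't really support this. Its ``pygame.mixer.music.get_pos()`` only
       reports how long the music has been playing, so in practice it's more of a
       fire-and-forget service.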
       
       Idealized timeline
       ~~~~~~~~~~~~~~~~~~
       
       At a minimum, you need to delay the start of a scene until the
       audio starts moving.
       
       The disadvantage of audio running in a separate thread (as is normally
       done) is that it may not be at the same place as the animation thread.
       The play speed shouldn't have glitches, but the start times can be
       a bit wobbly.
       
       At the very least, you need to support pausing until the audio is
       ready. Having all sense of time come from the audio goes one step
       further, as it makes the primary timekeeper the audio system.
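
       A sketch of letting the audio be the timekeeper. ``position`` stands in
       for whatever call your audio back-end offers to report the playback
       position in seconds (returning ``None`` before playback begins); that
       callable and the ``AudioClock`` class are assumptions, not an existing
       API::

           import time


           class AudioClock:
               """Derive the show's sense of time from the audio position."""

               def __init__(self, position, poll=0.005):
                   self.position = position  # hypothetical back-end query
                   self.poll = poll          # seconds to sleep between queries

               def wait_for_start(self, timeout=5.0):
                   """Block until the audio is moving; return its position."""
                   deadline = time.monotonic() + timeout
                   while time.monotonic() < deadline:
                       now = self.position()
                       if now is not None and now > 0:
                           return now
                       time.sleep(self.poll)
                   raise TimeoutError('audio never started')

               def due(self, events):
                   """Split (seconds, action) events into (ready, pending)."""
                   now = self.position() or 0.0
                   ready = [e for e in events if e[0] <= now]
                   pending = [e for e in events if e[0] > now]
                   return ready, pending

       The animation loop would call ``wait_for_start`` once per scene and then
       ``due`` on every screen update, so a late audio start delays the visuals
       instead of desynchronizing them.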
       
       Some systems (such as PyGame) have a distinction between a sound effect
       that is loaded entirely into memory and a streamed background music
       file.

       Even if you're technically dealing with background music, getting the
       timing right may require loading more of the file into memory more of
       the time. Accept that you may need a whole song in memory, and that
       you can only reasonably change this during scene breaks.
       
       Going further
       -------------
       
       You should be able to do your final rendering non-real-time, so you'll
       always be properly synchronized.
       
       Non-real-time is the ideal for keeping audio and video synchronized.
       It allows you to bite off small pieces of audio at a time and know
       that everything will line up.
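
       To illustrate why this lines up, the sketch below pairs each video frame
       with the exact slice of audio samples it covers. The frame rate, the
       sample rate, and the ``frame_plan`` name are assumptions chosen for the
       example::

           def frame_plan(duration, fps=20, sample_rate=44100):
               """Yield (frame, seconds, first sample, end sample) tuples.

               `fps` and `sample_rate` are integers; `end sample` is exclusive.
               """
               total_frames = int(duration * fps)
               for frame in range(total_frames):
                   # Integer arithmetic keeps the slices contiguous, so
                   # rounding never accumulates into audio drift.
                   first = frame * sample_rate // fps
                   end = (frame + 1) * sample_rate // fps
                   yield frame, frame / fps, first, end

       A renderer could walk this plan at whatever speed the machine manages,
       drawing the screen for each timestamp and copying the matching samples.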
       
       Features and timeline
       ~~~~~~~~~~~~~~~~~~~~~
       
       mvp:
       
           Preload the entire audio file into memory and avoid streaming from
           disk.
       
           Try to change scenes or cameras at background song boundaries.
           Keep these as isolated units, knowing they'll get stitched together
           during editing.
       
       mvp+1:
       
           Start of video is delayed until the audio starts playing.
       
           If each scene is independent, a scene may have a pause to start. This
           can be corrected in post, as needed, but should keep audio and video
           consistent.
       
           It might be useful to have the video's sense of time come from the core
           background track, but I'm not sure that's 100% needed without further testing.
       
       mvp+2:
       
           Non-realtime rendering ensures that audio and video are always synchronized.
       
           This is by far the gold standard. Ideally, you can do this at faster than
           real time.
       
       ======================
       ASCII as a visual form
       ======================
       
       I used big Figlet_ ASCII Art fonts for my test video.
       
       Monitors are bigger and higher resolution than ever before, right?
       
       But this is really what you need. Huge text, even in text mode.
       
       Some of the viewers will watch it full-screen, sure, but a significant
       population will half-distractedly watch a thumbnail instead.
       
       If it's a silent film, folks will need to go larger to read what is
       going on. So, in a silent-film context, a text-based Roguelike user
       experience may still work. Further experimentation is required.
       
       Still, consider going for an older aesthetic and angling for 40x25 (or
       thereabouts) instead of something more modern. 
       
       Idealized visual form
       ~~~~~~~~~~~~~~~~~~~~~
       
       I'm still thinking about old school RPGs.
       
       Fixed camera at best. Top-down maps. A few fixed expressions in close-up.
       Maybe a giant close-up like you find in visual novel games. 
       
       And a dedicated section for dialog to appear.
       
       Maybe menu-style alternative dialog, too. Of course this would be just a fake,
       but it would be easy flavor.
       
       It would be mostly tile-based with a few larger graphics now and then.
       
       It would probably be less than 40 tiles wide. The Roguelike people have
       to make a lot of compromises about visible map size versus map quality,
       so if you're curious about the how and why, you can always look there.
       
       Going further
       -------------
       
       Honestly, I'd really like to have something like `The Sims`_ where instead
       of semiautonomous entities, you just had actors you could control and
       play and rewind their time.
       
       There is MakeHuman_ which provides an open-source method to generate and
       render humans. It has a lot of output formats.
       
       I wouldn't mind using the entire virtual worlds of The Sims, though.
       If we had the capacity to use assets from The Sims (on par with,
       say, other open-source games that require commercial assets), it would
       allow us to use the third-party assets as well, of which there are
       a considerable number, some with decent licenses.

       There are other 3D games we might be able to leverage, but few
       are designed for normal, ordinary world stuff like The Sims.
       
       `Garry's Mod`_ might technically work, but modifying maps is a fair bit
       more complicated, and it uses a commercial engine... Then there's the mod
       community that mostly just steals stuff from commercial games and is
       full of fascists... Not very appealing.
       
       Features and timeline
       ~~~~~~~~~~~~~~~~~~~~~
       
       mvp:
       
           Modeled after an RPG, or a text-based Roguelike. A dedicated place
           for the dialog. Right versus left: the main character versus
           whoever is being talked to. It's an easy UX to write that's flexible
           for many types of stories.
       
       mvp+1:
       
           It's possible to experiment with 3D without actually having a 3D
           game. The portraits can be animated 3D models, there can be
           cut-scenes. These, too, are standard components of games.
       
       mvp+2:
       
           This would be a 3D video, so more like a silent cartoon. Instead
           of the interface having a dedicated place for dialog, it would be
           handled more like standard closed captions.
       
       =========
       Phase One
       =========
       
       Timeline using real time units. The line between "scene" and
       "show" can be blurry or not exist.
       
       Preload the entire audio file into memory and avoid streaming from
       disk.
       
       Try to change scenes or cameras at background song boundaries.
       Keep these as isolated units, knowing they'll get stitched together
       during editing.
       
       Modeled after an RPG, or a text-based Roguelike. A dedicated place
       for the dialog. Right versus left: the main character versus
       whoever is being talked to. It's an easy UX to write that's flexible
       for many types of stories.
       
       
       Visual Idea
       ~~~~~~~~~~~
       
       Here's an idea for a roguelike visual (since they map to documents
       more easily)::
       
           +----------------------------------------+
           |                 ",,,,.........."       |
           |                 ",,,,.,,""""".."       |
           |                 "####'##"   "..*""*    |
           |                  #...AB#    *....."    |
           |                  #.....#    """*..*"""*|
           |                #####D######    "".....0|
           |                #..........#     ""*"""*|
           |                #...>......#            |
           |                ############            |
           |                                        |
           +----------------------------------------+
           |Betty can:                              |
           |  signal to Ada to leave, ASAP.         |
           |> ask for garlic (nicely).             <|
           |  mock the blood on his necktie.        |
           |                                        |
           +----------+-----------------------------+
           | Ada      |Dracula: Good evening!       |
           |>Betty   <|Ada: We're here to fix your  |
           |          | computers.                  |
           |          |Dracula: The basement is     |
           |          | over here!                  |
           +----------+-----------------------------+
           
       
           +----------------------------------------+
           |                 """"""""""""""""       |
           |                 ",,,,.........."       |
           |                 ",,,,.,,""""".."       |
           |                 "####+##"   "..*""*    |
           |                             *....."    |
           |                             """*..*"""*|
           |                                ""...AB0|
           |                                 ""*"""*|
           |                                        |
           |                                        |
           |                                        |
           |                                        |
           |                                        |
           +You see:--+Near Old House---------------+
           |0: to car |The house appears ancient    |
           |          | with fine, hand-crafted     |
           +----------+ details now falling to ruin.|
           |>Ada     <|Betty: We're lost, Ada.      |
           | Betty    | Admit it!                   |
           |          |Ada: We're not lost! We're...|
           |          | ... Alright, Betty. We're   |
           |          | lost.                       |
           +----------+-----------------------------+
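
       The dialog pane in these mock-ups is simple to produce. The sketch below
       assumes the 29-column pane drawn above and uses the standard ``textwrap``
       module; ``dialog_lines`` and ``dialog_pane`` are hypothetical helpers::

           import textwrap


           def dialog_lines(speaker, text, width=29):
               """Wrap one line of dialog, indenting continuations by a space."""
               return textwrap.wrap(f'{speaker}: {text}', width=width,
                                    subsequent_indent=' ')


           def dialog_pane(entries, width=29, height=5):
               """Return the last `height` rows of dialog, padded to `width`."""
               rows = []
               for speaker, text in entries:
                   rows.extend(dialog_lines(speaker, text, width))
               rows = rows[-height:]                    # keep the newest rows
               rows += [''] * (height - len(rows))      # pad a short log
               return [f'|{row:<{width}}|' for row in rows]

       Keeping only the newest rows gives the scrolling-log feel of a Roguelike
       message area without any extra state.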
       
       Source Idea
       ~~~~~~~~~~~
       
       Here's a potential source snippet leading up to the above::
       
           # Sketch of a hypothetical API: Actor, Thing, and Scene don't exist yet.
           def make_welcome_scene(ambient_creep):
               ada = Actor('Ada', player=True)
               betty = Actor('Betty', player=True)
               dracula = Actor('Dracula')
               passage = Thing('to car')
               welcome_scene = Scene('0:00', map='dracula_floor_1', audio=ambient_creep,
                                     title='Near Old House',
                                     place={'A': ada, 'B': betty, 'D': dracula, '0': passage})
               betty.follow(ada)
               # A 'M:SS' string is a timestamp; a bare number is a relative delay in seconds.
               betty.say('0:01', "We're lost, Ada. Admit it!")
               ada.say("We're not lost. We're...")
               ada.say(1, "... Alright, Betty. We're lost.")
               # Walk toward the door, marked '+' on the map.
               ada.move_to('0:05', welcome_scene.map.find('+'), proximity=3)
               betty.choice("Dare Ada to lie about why we're here.",
                            "Say: We're computer technicians.",
                            "Say: We're here to suck his blood!",
                            "Say: We're pest control.",
                            pick=0, delay=0.5)
               betty.emote('smiles and looks at Ada.')
               ada.emote(0.2, 'squirms. "You have an idea.'
                              " It's a bad one. That's your bad idea face.\"")
               betty.say("We should say we're here to suck his blood.")
               ada.say("What? No.")
               ada.say(0.2, "There's no reason he'd let us in if we said that.")
               betty.emote(0.2, "nods. \"You're right. We should do something else.\"")
               betty.say(0.1, "I know. I dare you to say we're computer technicians.")
               ada.say('What?')
               ada.say(0.5, "You're mean. You know that, right?")
               welcome_scene.wait(0.2)
               return welcome_scene
       
       ==========
       Reflection
       ==========
       
       It's interesting that nothing about my example actually needs the
       background track to be sample-precise with the visual. How important is
       that, really? Maybe this is something that's only really needed for the
       lyric tracks and when there's explicit synchronized timing.
       
       (For sample-precise timing to music, you might think of having a
       dedicated MIDI track for the action triggers. However, that's different
       than my above example.)
       
       Even the "real timeline" thing is a bit fuzzy. Scenes start with a real
       time that's used as an offset for timestamps mentioned in the scene,
       yes. But what I actually use in the example are mostly relative time in
       seconds.
       
       The given example has what could be a looping ambient track for the
       background. I think of it going silent and a knocking sound as part of
       the transition to the scene with the door open, but... I can also see
       long ambient tracks that fit multiple scenes.
       
       This means we'd need an ``advisory_start`` which would start audio within
       the file if you're jumping into it, but let it flow naturally if you're
       starting at a previous scene. Ideally, this could be part of the next
       bit...
       
       Not all scenes will have fixed starting state. Sometimes state will
       depend upon previous state. We still need to jump to arbitrary scenes to
       aid in development. We can manage this by caching scene state at the end
       of scenes when this is needed.
       
       We could either always overwrite, or create a new file separate from
       the working file and make the developer manually overwrite. I favor
       always-overwrite, but user-overwrite would be more like traditional
       film. (I want fast and easy. Post-processing audio, as for traditional
       film, is neither of these things.)
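
       A minimal sketch of the always-overwrite option, assuming scene state can
       be expressed as JSON-friendly data. The function names and the
       one-file-per-scene layout are hypothetical::

           import json
           from pathlib import Path


           def save_state(cache_dir, scene_name, state):
               """Write the state a scene ended with, replacing any old copy."""
               path = Path(cache_dir) / f'{scene_name}.json'
               path.write_text(json.dumps(state, sort_keys=True, indent=2))
               return path


           def load_state(cache_dir, previous_scene, default=None):
               """Load the state left by the previous scene, if it was cached."""
               path = Path(cache_dir) / f'{previous_scene}.json'
               if not path.exists():
                   return default
               return json.loads(path.read_text())

       Jumping straight to a scene then means loading the cached state of the
       scene before it, rather than replaying everything that came earlier.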
       
       Roguelike games can easily have a dedicated region for text. My example
       above was narrow, but I think that for a Roguelike, aiming for 80 or more
       columns by 24 or more rows is reasonable. Probably with three panes instead
       of whatever I was thinking above: one for the map, one for dialog and feedback,
       and another for equipment or stats or even inventory.
       
       A design aiming after a GUI RPG allows us to have portraits, but turns
       back-and-forth dialog into what is effectively a cut-scene. There's
       nothing wrong with that, but it's different work than the main stuff.
       
       Graphic RPGs will have smaller maps than the text-based games. Any
       graphic RPG game that uses a "minimap" of some sort does so because
       the primary view is pretty but doesn't convey enough information about
       where you are in relation to your objectives. You see this less with
       third-person turn-based games than with first-person live-action games,
       but this is totally fine for our particular use-case. Huge, pretty tiles
       and a light-weight sketch of the neighborhood in a corner for flavor.
       
       If a show were to mostly have back-and-forth dialog, it should
       probably aim to feel more like a visual novel game and not an RPG.
       This would be lots of dialog with big portraits and usually some
       relationship-based questions.
       
       .. References (inline links when rendered)
       
       .. _Asciimatics: https://github.com/peterbrittain/asciimatics
       
       .. _Garry's Mod: https://gmod.facepunch.com/
       
       .. _Figlet: http://www.figlet.org/
       
       .. _MakeHuman: http://www.makehumancommunity.org/
       
       .. _PyGame: https://www.pygame.org/
       
       .. _The Sims: https://www.ea.com/games/the-sims
       
       ----
       
 (DIR) Category: Essays and Thoughts