[HN Gopher] Show HN: RoboPianist, a piano playing robot simulati...
       ___________________________________________________________________
        
       Show HN: RoboPianist, a piano playing robot simulation in the
       browser
        
       Author : kzakka
       Score  : 124 points
       Date   : 2023-03-30 17:22 UTC (5 hours ago)
        
 (HTM) web link (kevinzakka.github.io)
 (TXT) w3m dump (kevinzakka.github.io)
        
       | Waterluvian wrote:
       | I'm so excited and curious about this that I can't even structure
       | my thoughts well. They're just flooding my brain.
       | 
       | What part of this is pre-coded? What part is being generated? Is
       | the goal to give a program some sheet music (maybe a MIDI file)
       | and it figures out the fingerings[1] and then translates the
       | fingerings into kinematics?
       | 
       | Because if that's the goal... Holy forking shirtballs that would
       | be amazing. One of the trickiest things for me as a novice
       | pianist is figuring out the fingerings to a piece. It's like a
       | puzzle you work at until you've figured out what's comfortable.
       | It's all about lookahead. "This section generally goes down so I
       | probably want to begin with my pinky and not my thumb."
       | 
       | And if it got really good at that, not only are the fingerings
       | useful, but maybe we could get feedback on how physically
       | demanding a piece is. Another challenge I've discovered as a
       | novice is that it can be surprisingly tricky to look at sheet
       | music or hear a piece and determine if it's as easy as it sounds.
       | Some pieces require some very complex fingering.
       | 
       | [1] what pianists call the determination of what fingers go
       | where, not just to play certain notes together, but to ensure you
       | can fluidly and comfortably play the next notes as well.
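A minimal sketch of the lookahead idea described above, treating fingering as a shortest-path problem over finger choices. The cost function is a made-up assumption for illustration (roughly "two semitones per finger step", plus a penalty for reusing a finger on a different note); it is not the method used by RoboPianist or by any fingering dataset, and real fingerings depend on the individual hand.

```python
from typing import Dict, List

FINGERS = (1, 2, 3, 4, 5)  # right hand: 1 = thumb, 5 = pinky


def transition_cost(pitch_a: int, finger_a: int, pitch_b: int, finger_b: int) -> int:
    """Hypothetical comfort cost of moving from one (note, finger) to the next."""
    interval = pitch_b - pitch_a        # semitones, positive = upward
    finger_step = finger_b - finger_a   # positive = toward the pinky
    cost = abs(interval - 2 * finger_step)
    if finger_a == finger_b and pitch_a != pitch_b:
        cost += 4                       # assumed penalty for hopping with one finger
    return cost


def best_fingering(pitches: List[int]) -> List[int]:
    """Dynamic programming over all finger sequences (Viterbi-style)."""
    if not pitches:
        return []
    # cost[f] = cheapest total cost of the notes so far, ending on finger f
    cost: Dict[int, int] = {f: 0 for f in FINGERS}
    back: List[Dict[int, int]] = []
    for prev, cur in zip(pitches, pitches[1:]):
        new_cost: Dict[int, int] = {}
        pointers: Dict[int, int] = {}
        for f in FINGERS:
            g = min(FINGERS, key=lambda g: cost[g] + transition_cost(prev, g, cur, f))
            new_cost[f] = cost[g] + transition_cost(prev, g, cur, f)
            pointers[f] = g
        cost = new_cost
        back.append(pointers)
    # Walk the back-pointers from the cheapest final finger.
    finger = min(FINGERS, key=lambda f: cost[f])
    result = [finger]
    for pointers in reversed(back):
        finger = pointers[finger]
        result.append(finger)
    return result[::-1]


descending = [67, 65, 64, 62, 60]  # G F E D C as MIDI note numbers
print(best_fingering(descending))
```

Under this toy cost, a descending five-note run starts on the pinky, which matches the "begin with my pinky and not my thumb" intuition; a different cost function would give different answers.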
        
         | holri wrote:
         | Piano fingering is a very subjective matter, because every
         | hand, finger, arm and taste is different. I doubt that a robot
         | fingering is of any use. You have to find it out for yourself
         | under guidance of a teacher and maybe inspiration by fingerings
         | of master pianists written in the score. The Henle app has a
         | feature to show them separately. They differ vastly for the
         | same piece.
        
         | kzakka wrote:
         | Hi, one of the authors here :)
         | 
         | The demo you are watching is an agent trained from scratch with
         | reinforcement learning. It has roughly 6 days of experience
         | (10M steps at 20 Hz). The JavaScript demo is replaying the
         | policy open loop, which is why it's not super robust to
         | disturbances.
         | 
         | Re: fingering: we actually use fingering information to create a
         | dense reward for the agent (otherwise it makes exploration
         | super hard). It would be an exciting future direction to have
         | the agent discover and optimize for fingering that best suits
         | its kinematics :) And beyond that, having RL inform pianists
         | about the difficulty of a piece or even more optimal fingering
         | would be amazing.
         | 
         | We trained a bunch of these policies on roughly 150 songs
         | (baroque, romantic, classical) and we did some analysis in the
         | paper if you're interested:
         | https://kzakka.com/robopianist/robopianist.pdf
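The "roughly 6 days" figure follows directly from the step count and control rate quoted in the comment above. A quick check, where the function name is just an illustrative helper:

```python
def simulated_experience_days(num_steps: int, control_hz: float) -> float:
    """Convert a number of control steps at a fixed rate into days of simulated time."""
    seconds = num_steps / control_hz
    return seconds / 86_400  # seconds per day


days = simulated_experience_days(10_000_000, 20)
print(f"{days:.2f} days")  # 10M steps at 20 Hz is 500,000 s, about 5.79 days
```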
        
           | Waterluvian wrote:
           | Yes! Thank you. This paper is exactly what I needed.
        
           | gus_massa wrote:
           | Nice project! Anyway, one of the unrealistic details is that
           | the robot in the simulation curls the fingers when it is not
           | using them. In particular the pinky finger. Can that be fixed
           | in a future version? For comparison, I got this as the first
           | result in Google https://www.youtube.com/watch?v=cGYyOY4XaFs
           | 
           | It's also strange that all fingers are always parallel, but I
           | guess that adding that freedom makes the search space huge.
        
             | zeitgeistcowboy wrote:
             | I don't think the intention of this simulation is to be
             | realistic. This particular agent just learned to play the
             | music it was reinforced to learn given the physics
             | constraints programmed for the hand mechanics (as far as I
             | understand it). I doubt the physics emulate our human hands
             | very accurately so I wouldn't expect it to be "realistic"
             | or something that needs to be "fixed" unless the specific
             | intention was to optimize actual human hand movements.
        
               | kzakka wrote:
               | Yup, we're not trying to mimic human movements exactly
               | but rather optimizing for the reward given the robot
               | hardware. Fun fact: we do things like add an energy
               | penalty to try to reduce jitteriness and non-human-like
               | movements, and it does help enormously.
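A minimal sketch of what an energy penalty in a reward can look like. The exact formulation and coefficient used by RoboPianist are not stated in this thread, so the names, the |torque x velocity| form, and the default coefficient below are assumptions for illustration only.

```python
from typing import Sequence


def energy_penalty(torques: Sequence[float], joint_velocities: Sequence[float]) -> float:
    """Sum of |torque * velocity| over actuators (magnitude of mechanical power)."""
    return sum(abs(t * v) for t, v in zip(torques, joint_velocities))


def total_reward(
    key_press_reward: float,
    torques: Sequence[float],
    joint_velocities: Sequence[float],
    energy_coef: float = 0.005,  # hypothetical weight, not taken from the paper
) -> float:
    """Task reward minus a weighted energy term, so jittery motion scores lower."""
    return key_press_reward - energy_coef * energy_penalty(torques, joint_velocities)


# Same task reward, but the jittery motion uses more power and is rewarded less.
smooth = total_reward(1.0, [0.1, 0.1], [0.2, 0.2])
jittery = total_reward(1.0, [2.0, -2.0], [3.0, -3.0])
print(smooth > jittery)
```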
        
           | jacquesm wrote:
           | Fingering is harder than it seems, especially once you start
           | to take speed into account: fingerings that work when playing
           | slowly may not work when playing fast. And individuals have
           | different hand spans, so a fingering that works for one
           | person may not work for another.
           | 
           | If you crack this in a deterministic way it would be super
           | useful as a library.
        
             | Waterluvian wrote:
             | Well when you put it like that, it truly does sound like an
             | absolutely delicious problem to tackle.
        
           | dfan wrote:
           | This is really cool!
           | 
           | There are two motions in particular that pianists use
           | constantly that don't seem to be represented in the robot
           | model, if you're looking to get closer to the way that human
           | limbs and digits operate. (Naturally there are plenty of
           | other goals, but if you can imitate human playing you can do
           | things like suggest fingerings or assess difficulty, as you
           | say.)
           | 
           | 1) turning at the elbow (so that your forearm can make an
           | angle with the piano keyboard instead of always being
           | perpendicular to it). It looks like you translate the forearm
           | back and forth instead, which I assume must be a lot easier
           | to handle because of course it's not how human arms work.
           | 
           | 2) rotating the forearm/wrist (like turning a doorknob).
           | Pianists do this on basically every note to a greater or
           | lesser extent. To take an extreme example, if you alternate
           | notes with your thumb and pinky you are almost completely
           | using your wrist and not your fingers. Without this degree of
           | freedom it is not really possible to emulate a competent
           | pianist, if that is one of the eventual goals.
        
             | kzakka wrote:
             | Thanks! We did indeed explore these additional degrees of
             | freedom, you can find vestigial code for this here:
             | https://github.com/google-research/robopianist/blob/main/rob...
             | 
             | We ended up picking a minimal subset of forearm DoFs that
             | wouldn't impact training speed too much.
        
           | alana314 wrote:
           | This is insanely impressive. For fingering, in the right hand
           | I typically put my pinky on the highest note of a phrase; it
           | feels more comfortable and you can accent it more than the
           | middle fingers. In the left hand I typically put the bass
           | note in the pinky as well. The middle fingers aren't as
           | dexterous so I use them less, though a concert pianist could
           | probably use your fingering. Technique-wise, humans cup their
           | hands more: the palm is arched where the robot's is flat. But
           | who says it needs to model humans exactly? I can't believe
           | this is working in three.js! Amazing work!
        
             | alana314 wrote:
             | Here's some fingering for Turkish march
             | https://musescore.com/user/73797/scores/142975
        
         | [deleted]
        
         | rideontime wrote:
         | clicking the link at the top of the page may help explain :)
         | https://github.com/google-research/robopianist/
        
           | Waterluvian wrote:
           | I had. It really doesn't help much.
        
           | ksherlock wrote:
           | That URL also explains why it only works on Chrome.
        
             | kzakka wrote:
             | Hi author here! The app should work on mobile/desktop and
             | was tested on both Safari and Chrome. I've heard it's buggy
             | for some people (unclear if it's an older hardware problem)
             | but you can try this embedded demo which works better:
             | https://kzakka.com/robopianist/#demo
        
       | patrakov wrote:
       | Bug: if WebGL rendering is too slow (e.g. a Haswell iGPU at 4K
       | resolution), the song stalls completely. It should still play
       | with reduced fps for rendering. Resizing the window down lets it
       | proceed. Maximizing it again stalls it again.
        
       | abrichr wrote:
       | Excellent, thank you for making this open source!
       | 
       | I had been playing with the idea of creating a browser-based
       | virtual piano for when I'm travelling and don't have access to a
       | real piano but have my laptop with me. The idea would be to point
       | the webcam down at the table between me and the laptop, and play
       | on the table as if a piano were there. Then use the mediapipe
       | framework [1] to capture finger positions, and use those to
       | update a virtual environment like the one you have here.
       | 
       | I put it on hold due to the significant engineering required, but
       | it seems you have already implemented (and open sourced!) the
       | browser-based piano simulation component.
       | 
       | A quick scan through your repo indicates that this is all
       | implemented in Python. I see that you are using mujoco_wasm [2].
       | Can you please comment on what is required to compile your
       | project to work in the browser?
       | 
       | Thank you again!
       | 
       | [1] https://google.github.io/mediapipe/
       | 
       | [2] https://github.com/zalo/mujoco_wasm
        
       | [deleted]
        
       | reffaelwallen wrote:
       | Please add sustain pedal as well, you will get 10x positive
       | reactions
        
         | Traubenfuchs wrote:
         | I feel like the sustain pedal is to the piano what the beauty
         | filter is to Facebook.
        
         | kzakka wrote:
         | We have the sustain pedal implemented in the standalone MuJoCo
         | simulation, e.g. see
         | https://www.youtube.com/watch?v=VBFn_Gg0yD8. I just couldn't
         | figure out how to do it with Tone.js :(
        
       | ShadowBanThis01 wrote:
       | Pretty interesting. Just a heads-up: On my system, in desktop
       | Safari, the simulation stalled almost immediately. Upon reload,
       | the fingers moved but no sound came out. After I turned up the
       | volume, I opened the control panel and suddenly the sound started
       | working and blasted me out. Beware!
       | 
       | After that it seemed to work OK.
        
       | thallium205 wrote:
       | Pianist hands and fingering don't really look like that while
       | playing.
        
         | londons_explore wrote:
         | I suspect these simulated keys have no weight/spring back
         | force, i.e. the robot hand doesn't have to 'push hard' to make
         | them go down.
         | 
         | It would be equivalent to you trying to play a holographic
         | piano in the air in front of you. I suspect you'd adopt a very
         | different hand position too.
        
           | kzakka wrote:
           | We're using the MuJoCo simulator under the hood:
           | https://mujoco.org/
           | 
           | The keys are actually implemented using a spring mechanism,
           | but springs in MuJoCo are currently linear, which isn't the
           | case in the real world.
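To illustrate the distinction drawn above: a linear spring's restoring force scales in direct proportion to displacement. The second function below is a purely hypothetical nonlinear (cubic hardening) spring added for contrast; it is not a model of a real piano key, and neither function is MuJoCo API.

```python
def linear_spring_force(stiffness: float, displacement: float) -> float:
    """Hooke's law: restoring force proportional to displacement."""
    return -stiffness * displacement


def hardening_spring_force(stiffness: float, cubic_coef: float, displacement: float) -> float:
    """Illustrative nonlinear spring: gets stiffer the further it is pushed."""
    return -stiffness * displacement - cubic_coef * displacement**3


# Doubling the displacement exactly doubles the linear force...
print(linear_spring_force(2.0, 0.02) / linear_spring_force(2.0, 0.01))
# ...but more than doubles the force of the hardening spring.
print(hardening_spring_force(2.0, 5000.0, 0.02) / hardening_spring_force(2.0, 5000.0, 0.01))
```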
        
         | sfblah wrote:
         | I can play the songs on the list, and the fingering is
         | completely wrong.
        
           | ubj wrote:
           | Fingering isn't a fixed "right" or "wrong" matter. Every
           | pianist's hands are different, so fingerings may vary from
           | person to person.
        
           | kzakka wrote:
           | The fingering we use is taken directly from:
           | https://beam.kisarazu.ac.jp/~saito/research/PianoFingeringDa...
           | 
           | From their whitepaper: "Fingerings in the dataset were
           | provided by experienced pianists who graduated from a music
           | college or who had played the piano for more than twenty
           | years. The pianists were asked to choose pieces that they
           | could play and provided the fingering that they had actually
           | used for the performance."
        
         | londons_explore wrote:
         | I think it's because the 38 muscles they have given their robot
         | hand don't map perfectly with real muscles in a real hand.
         | 
         | In particular, there seems to be not much ability to move each
         | finger left or right. For example, actuator 14 lets the baby
         | finger swing outwards - but now try to stretch your baby finger
         | out, and you see it can swing much further than this robot
         | finger can swing out, and the pivot is closer to your wrist,
         | allowing more reach.
        
       | samstave wrote:
       | _Computer_
       | 
       |  _I say, Computer?_
       | 
       |  _Uh. you just have to use the _keyboard__
       | 
       |  _ah, a keyboard, how quaint!_
       | 
       | -
       | 
       | How many words a minute can it type?
       | 
       | And while the refinements in movement finesse and control are
       | obviously a needed thing, out of ignorance, what other abilities
       | will this allow?
       | 
       | I assume that it will allow for a much more finessed touch
       | control of hands/digits and, coupled with sensors as they
       | evolve, be capable of much finer crafts, such as embroidery?
        
       | green_man_lives wrote:
       | I don't really have a background in ML or anything, but how
       | generalizable is this? Is the goal to have a trained model for
       | each specific song using this framework? It's pretty amazing, I
       | love seeing anything robotics related in the browser.
        
       | telesilla wrote:
       | Oh dear, it's like listening to a 12 year old practicing before
       | their lesson. But hey! Well done Robot for the achievement. I bet
       | in no time you'll be https://youtu.be/e37NxUtFQSo. Good luck with
       | the training!
        
       | huhtenberg wrote:
       | Whoopses shortly after a load with
       | 
       |     Uncaught Error: buffer is either not set or not loaded
       |         ti https://unpkg.com/tone@14.7.77:1
       |         start https://unpkg.com/tone@14.7.77:21
       |         triggerAttack https://unpkg.com/tone@14.7.77:21
       |         triggerAttack https://unpkg.com/tone@14.7.77:21
       |         processPianoState https://kevinzakka.github.io/robopianist-demo/examples/main.js:174
       |         render https://kevinzakka.github.io/robopianist-demo/examples/main.js:255
       |         onAnimationFrame https://kevinzakka.github.io/robopianist-demo/node_modules/three/build/three.module.js:27951
       |         onAnimationFrame https://kevinzakka.github.io/robopianist-demo/node_modules/three/build/three.module.js:12661
       | 
       | This is in a recent Firefox.
        
         | rzzzt wrote:
         | I needed to reload the page once and click on it to start. It
         | might have to do with autoplay blocking.
        
           | kzakka wrote:
           | Indeed, browsers are pretty aggressive about not autoplaying
           | sound, you need to interact with the screen (mouse click or
           | finger tap + make sure your sound is up / ringer not on
           | silent).
        
       ___________________________________________________________________
       (page generated 2023-03-30 23:00 UTC)