[HN Gopher] Real time image animation in opencv using first orde...
___________________________________________________________________
Real time image animation in opencv using first order model
Author : abhas9
Score  : 159 points
Date   : 2020-05-26 15:21 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| qchris wrote:
| I'm a huge fan of this kind of practice, where the code for a
| paper is all located in a single public repository with build
| instructions, along with directions for how to cite it.
| Obviously, it's a little tough to do with some more
| data-intensive sources (besides GH hosting limits, no one really
| wants to download 100G of data if they're just trying to clone a
| repository), but this kind of thing sets a high standard for
| reproducibility of published results.
| dllthomas wrote:
| > but this kind of thing sets a high standard for
| reproducibility of published results.
|
| I think making the code available is good, but we should be
| careful how we use the term "reproducibility". Pulling your
| repo and running it had better give the same results, but that's
| not the same sort of thing as building my own experimental setup
| according to a paper's specification. The latter leaves more
| room for variability, so a successful replication speaks more
| strongly to the robustness of the result, and it also puts human
| brain power next to each step of the process in a way where
| weirdness might be noticed.
|
| Replication should probably involve reimplementation if it's to
| carry its traditional weight. In the event that we fail to
| replicate, though, having the source code for both versions is
| likely to be hugely informative.
| enchiridion wrote:
| While I agree, I think reimplementation is a high bar for most
| research, especially in very niche areas.
|
| I think extension also carries similar value. It is less grunt
| work, but it still requires a deep understanding of the existing
| code.
"Weirdness" should | quickly become apparent. | dekhn wrote: | I think this is a fair point, but in my experience, having a | concrete replica that people can start from (and compare to | the paper) can make a year's difference in speeding up | progress. | | Many times, I've read a paper, thought something was great, | and then implemented the paper and failed to reproduce the | author's results. In the cases where I've been able to | compare my implementation to a reference on github, I often | find the paper doesn't match the code, or a subtle data | processing step was left out. Having a replica (a commit hash | and a pointed to versioned input data) can often make a huge | difference in time. | dllthomas wrote: | Yeah, I'm certainly not saying it isn't advisable or even | important. I'm just saying it's not the same thing as | replication. | rozgo wrote: | It's incredibly satisfying to reproduce these papers. I now | make Rust versions of the most interesting projects. And try to | make low-latency inference pipelines for those that show | potential for real-time use. Some are sketched out here: | https://github.com/Simbotic/SimboticTorch | | The bulk of the work to get real-time working is to move more | of pipeline to GPU. Mostly things handled by numpy and some | image/video transformations. | qchris wrote: | That's awesome! If you're interested, there's a group working | on machine learning in Rust, including some working on doing | GPU pipelines for it at https://github.com/rust-ml/wg . I'm | not sure if any of the work being done right now is directly | applicable to any of the projects that you're reproducing, | but it might be worth a look! | rozgo wrote: | Oh, I'm interested. Thanks for letting me know. Would love | to contribute. | bsaul wrote: | How can it generate teeth that look like they fit the picture ??? | rozgo wrote: | This model is trained with short clips of human speech. 
| There is enough statistical information to "guess" how to fill
| the gap created by opened lips. I'm still amazed at how well it
| preserves temporal coherence (what it looks like from frame to
| frame).
| mister_hn wrote:
| Really cool, but I hoped to see C++ code for OpenCV, not Python
| [deleted]
| egfx wrote:
| Pretty cool. Reminds me of
| https://github.com/yemount/pose-animator
|
| I would use it if there were a JavaScript port.
| sriram_malhar wrote:
| Is no one else deeply afraid of this future?
| api wrote:
| Provenance and chain of custody are everything. They've always
| been important, but now they're critical. Any audio or video
| without a solid chain of custody is now suspect. Anonymous leaks
| are worthless, as anything can be faked by almost anyone with a
| PC.
|
| Old and busted: "pic or it didn't happen."
|
| New hotness: "in-person witness or it didn't happen."
| enchiridion wrote:
| Do I smell a blockchain application?
| api wrote:
| Cue 10 ICOs for AuthenticityCoin-type things, most of which
| just exit-scam and the rest of which don't actually work.
|
| The real security hole for forgery is at the point of injection.
| Tracking a forgery on a blockchain doesn't prove it's not a
| forgery.
|
| One thought is a camera sensor that cryptographically signs
| (watermarks) photos or video frames _on the sensor_, before they
| are touched by anything else. It's not perfect, since a highly
| sophisticated adversary could extract the secret key from the
| chip, but it could definitely make it quite a bit harder to fake
| photos. Nothing is ever perfectly secure. All security amounts
| to increasing the work function for violating a control to some
| decent margin above the payoff you get from breaking the
| control.
|
| I could see certified watermarking camera sensors being used by
| journalists, politicians, governments, police, etc.
| ivanstame wrote:
| Me too man
| echelon wrote:
| I'm not at all.
|
| Deep fakes are just like Photoshop, but instead of pictures, we
| can generate complex shapes in all sorts of signal domains.
|
| If you restrict the technology, it becomes the tool of state
| actors. If it's wide open, it's just a toy. Society will learn
| to accept it, just as it did with Photoshop.
|
| I'm actually really excited by the potential it unlocks. Our
| brains are already capable of reading passages in other people's
| voices and picturing vivid scenes without them ever existing.
| Deep models give computers the ability to do the same thing.
| That's powerful. It'll unlock a higher order of creativity.
| rozgo wrote:
| I'm working with the same model, but in a real-time pipeline
| developed with GStreamer, Rust, and PyTorch:
|
| https://twitter.com/rozgo/status/1255961525187235842
|
| Live motion transfer test with a crappy webcam:
|
| https://youtu.be/QVRpstP5Qws
| mv4 wrote:
| Nice. I want to try something like this.
| forgingahead wrote:
| Very cool. Reminds me of Avatarify, which is also based on the
| First Order Model work:
|
| https://github.com/alievk/avatarify
| roomey wrote:
| It looks the same, even the same images. I can only get 3 fps
| from Avatarify, and that's with CUDA. Is this one faster?
___________________________________________________________________
(page generated 2020-05-26 23:00 UTC)
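
The signing-sensor idea api describes upthread could look roughly like the sketch below. All names here are hypothetical, and real hardware would use an asymmetric signature (a private key fused into the chip, with the vendor publishing the public key so anyone can verify); a keyed HMAC with a shared secret is used only to keep the sketch stdlib-only, and it illustrates the shape of the scheme rather than a production design.

```python
import hashlib
import hmac
import os

# Hypothetical per-device secret; a real sensor would hold a private
# signing key in tamper-resistant hardware instead.
SENSOR_KEY = os.urandom(32)

def sign_frame(frame_bytes: bytes, frame_index: int) -> bytes:
    """Tag raw frame data, binding in the frame index so frames
    can't be silently reordered or dropped."""
    msg = frame_index.to_bytes(8, "big") + frame_bytes
    return hmac.new(SENSOR_KEY, msg, hashlib.sha256).digest()

def verify_frame(frame_bytes: bytes, frame_index: int, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = sign_frame(frame_bytes, frame_index)
    return hmac.compare_digest(expected, tag)

frame = b"\x00" * (640 * 480)      # stand-in for a raw sensor readout
tag = sign_frame(frame, 0)
assert verify_frame(frame, 0, tag)             # untouched frame verifies
assert not verify_frame(frame + b"x", 0, tag)  # any edit breaks the tag
assert not verify_frame(frame, 1, tag)         # so does reordering frames
```

As api notes, this only raises the work function: an adversary who extracts the key from the chip, or who injects a fake image in front of the sensor, defeats it.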