[HN Gopher] YOLOv7: Trainable Bag-of-Freebies
___________________________________________________________________
YOLOv7: Trainable Bag-of-Freebies

Author : groar
Score  : 46 points
Date   : 2022-07-16 19:27 UTC (3 hours ago)

(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)

| SrslyJosh wrote:
| > the highest accuracy 56.8% AP among all known real-time object
| detectors with 30 FPS or higher
|
| Yikes. It's not clear to me if that's the upper limit on accuracy
| or a limit imposed by requiring that it run at 30 FPS, but
| still...yikes.

  | JustFinishedBSG wrote:
  | It's clearly the latter and I don't see why it would be
  | "yikes". Real-time detectors are useless if "real time" means
  | 1 FPS.

    | SrslyJosh wrote:
    | What good is speed if the accuracy isn't significantly better
    | than a coin flip?
    |
    | From the paper:
    |
    | > For example, multi-object tracking [94, 93], autonomous
    | driving [40, 18], robotics [35, 58], medical image analysis
    | [34, 46], etc.
    |
    | LOL, these are all great use cases for a model with < 60%
    | accuracy!

| IncRnd wrote:
| In YOLOv7, YOLO and v7 don't go well together. No, not at all.
| YOLO normally means "You Only Live Once", and v7 means it's lived
| at least six times before this.
|
| While the author likely didn't have that intention, that's what
| came across.
|
| Even for YOLO meaning "You Only Look Once", YOLO and v7 do not go
| together well.

  | gchq-7703 wrote:
  | YOLO in this case stands for "You Only Look Once".

    | IncRnd wrote:
    | Yes.
    |
    | The point I was making is that YOLO and v7 don't go well
    | together, and that is true for either meaning of YOLO.

      | Dayshine wrote:
      | Huh? It means that the approach is to only process the
      | input image frame once, i.e. "look". And this is the 7th
      | implementation of that algorithm.
      |
      | It's not as if this is named "the final algorithm v7".

| isoprophlex wrote:
| The GitHub repo mentions "teaser: Yolov7-mask", showing segmentation
| as well. Highly relevant to my interests. Sadly I can't easily
| discern any other info on this topic.
|
| Does anyone know more, maybe?

  | hwers wrote:
  | What are you using it for, if you can share? I've thought about
  | training some of these and releasing the weights, but I've never
  | found a reason they'd really be useful to me personally, so it
  | never really happened.

| kylevedder wrote:
| Probably the most interesting trick from the paper is using the
| head as a soft supervisor for earlier layers of the network, with
| the intuition being that if the earlier layers learn to imitate
| the higher-capacity later layers, it frees up the capacity of the
| later layers to better learn the residual and provides a denser
| supervisory signal.

| squarefoot wrote:
| As someone who only got his feet wet with OpenCV some 20 years
| ago (basic shape recognition, no AI involved), what reading,
| software, etc. would you suggest to catch up and play with
| current technology without being inundated by theory that I'm
| sure I couldn't grasp?

| anewpersonality wrote:
| We should stop calling it YOLO after the creator quit machine
| learning.

  | isoprophlex wrote:
  | Especially hilarious considering some other people ALSO jumped
  | on the "we made an object detector so let's call it YOLOvX"
  | wagon and released...
  |
  | Something called YOLOv7.
  |
  | https://github.com/jinfagang/yolov7
___________________________________________________________________
(page generated 2022-07-16 23:00 UTC)
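
kylevedder's comment describes the final ("lead") head acting as a soft supervisor for earlier parts of the network. The following is only a rough sketch of that idea, not the paper's actual label assigner: it assumes the soft target for the true class is the IoU between the lead head's predicted box and the ground-truth box, and that an auxiliary head attached to an earlier layer is trained against that same target. The helper names (`iou`, `soft_label`, `bce`), the example boxes and probabilities, and the 0.25 auxiliary weight are all hypothetical choices made for illustration.

```python
import math

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_label(lead_box, gt_box, gt_class, num_classes):
    """Assumed soft target: the true class gets the lead head's box quality
    (IoU with ground truth) instead of a hard 1.0; other classes get 0.0."""
    quality = iou(lead_box, gt_box)
    return [quality if c == gt_class else 0.0 for c in range(num_classes)]

def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)

# Hypothetical example values for one matched object.
gt_box = (10.0, 10.0, 50.0, 50.0)
lead_box = (12.0, 12.0, 50.0, 50.0)   # box predicted by the lead head
target = soft_label(lead_box, gt_box, gt_class=1, num_classes=3)

lead_probs = [0.05, 0.90, 0.05]       # class probabilities from the lead head
aux_probs = [0.10, 0.70, 0.20]        # class probabilities from the auxiliary head

# Both heads are supervised by the same lead-derived soft target;
# the auxiliary term is down-weighted (0.25 is an arbitrary choice here).
lead_loss = bce(lead_probs, target)
aux_loss = bce(aux_probs, target)
total_loss = lead_loss + 0.25 * aux_loss
```

In a real training loop the soft target would be treated as a constant (no gradient flowing through it), so the auxiliary head learns to imitate the lead head's output rather than the other way around.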