(C) PLOS One This story was originally published by PLOS One and is unaltered.

Bats expand their vocal range by recruiting different laryngeal structures for echolocation and social communication [1]

Jonas Håkansson (Sound Communication and Behavior Group, Department of Biology, University of Southern Denmark, Odense M), Cathrine Mikkelsen, Lasse Jakobsen, Coen P. H. Elemans

Date: 2022-12

Abstract

Echolocating bats produce very diverse vocal signals for echolocation and social communication that span an impressive frequency range of 1 to 120 kHz or 7 octaves. This tremendous vocal range is unparalleled in mammalian sound production and thought to be produced by specialized laryngeal vocal membranes on top of vocal folds. However, their function in vocal production remains untested. By filming vocal membranes in excised bat larynges (Myotis daubentonii) in vitro with ultra-high-speed video (up to 250,000 fps) and using deep learning networks to extract their motion, we provide the first direct observations that vocal membranes exhibit flow-induced self-sustained vibrations to produce 10 to 95 kHz echolocation and social communication calls in bats. The vocal membranes achieve the highest fundamental frequencies (f o’s) of any mammal, but their vocal range of 3 to 4 octaves is comparable to that of most mammals. We evaluate the currently outstanding hypotheses for vocal membrane function and propose that most laryngeal adaptations in echolocating bats result from selection for producing high-frequency, rapid echolocation calls to catch fast-moving prey. Furthermore, we show that bats extend their lower vocal range by recruiting their ventricular folds, as in death metal growls, which vibrate at distinctly lower frequencies of 1 to 5 kHz for producing agonistic social calls.
The different selection pressures for echolocation and social communication facilitated the evolution of separate laryngeal structures that together vastly expanded the vocal range in bats.

Citation: Håkansson J, Mikkelsen C, Jakobsen L, Elemans CPH (2022) Bats expand their vocal range by recruiting different laryngeal structures for echolocation and social communication. PLoS Biol 20(11): e3001881. https://doi.org/10.1371/journal.pbio.3001881

Academic Editor: Simon W. Townsend, University of Zürich, SWITZERLAND

Received: June 24, 2022; Accepted: October 19, 2022; Published: November 29, 2022

Copyright: © 2022 Håkansson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by the Villum foundation grant 00025380 and Danish Research council grant DFF 8021-00155A to LJ, and Danish Research council grant DFF 7014-00270 to CPHE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors declare that no competing interests exist.

Abbreviations: AP, anterior-posterior position; CSA, cross-sectional area; GVG, glottovibrogram; PTP, phonation threshold pressure

Introduction

The evolution of powered flight, echolocation, and subsequent fast buzzing allows bats to hunt and capture fast-moving airborne prey and thereby exploit the riches of the night: flying insects [1,2]. To detect small prey, biosonar signals need to contain high frequencies to provide efficient acoustic reflection and high bandwidth to provide high localization accuracy and spatial resolution [3].
Echolocation thus selects for increased fundamental frequency, f o, and expansion of the f o range, and many species of bats (FM bats) produce precisely timed, frequency-modulated echolocation calls that sweep in f o from as high as 125 kHz down to approximately 10 kHz in calls of only 1 to 2 ms duration [3–5]. Some species have calls with f o up to 250 kHz [6], and, as such, bats produce the highest known voiced f o of all mammals. In addition, many bat species produce social communication calls [7], some of which extend their f o range further down to 1 kHz [8,9]. Thus, bats can produce very diverse signals that span an impressive f o range of 1 to 120 kHz or 6 to 7 octaves, while humans and other mammals typically only produce 3 octaves and in exceptional cases 4 to 5 [10]. The tremendous vocal range of bats is unparalleled in mammalian sound production, but how bats achieve this remains unknown.

Bat calls are produced laryngeally as in most mammals [11,12], but in bats, the vocal folds exhibit several adaptations compared to the generalized mammalian vocal fold, likely associated with echolocation requirements [13–16]. First, the paired vocal folds end in 6 to 10 micrometer thin, apical vocal membranes [15]. Such membranes have been reported in bats, cats, and nonhuman primates and have been suggested to act as low-mass oscillators that can vibrate almost independently of the vocal fold proper, and thereby support the production of high-frequency vocalizations [17–19]. In marmosets, the first direct observation of vocal membrane vibration showed that they can indeed vibrate at frequencies up to 9 kHz [19]. However, in bats, we lack direct observation of vocal membrane vibration and their function in vocal production remains untested. Second, another, smaller apical membrane, the ventricular membrane, points downwards from the ventricular folds [15].
In general, the ventricular folds have received very little attention in comparative bioacoustics, and while their altered geometry in bats suggests function, this remains untested.

A combination of different laryngeal structures may serve to facilitate the tremendous f o range and different call types bats produce. Indeed, other mammals, such as marmosets, can switch from vocal fold to vocal membrane vibration over postnatal development [19]. In humans, ventricular folds play a role in several low-frequency forms of singing, such as death metal grunting and Tuvan throat singing, where they can touch the vocal fold and increase the mass of the oscillating structures [20]. This results in a much lower f o than can be achieved by the vocal folds alone [20,21]. Additionally, human vocal folds can exhibit different oscillation regimes in the different voice registers, such as vocal fry, chest, and falsetto, that expand the vocal range [22,23]. In an excised bat larynx preparation, abrupt changes in acoustics were attributed to such register jumps [11]. However, no direct evidence exists on whether and how laryngeal and ventricular structures can vibrate to produce sound in bats, due to the challenges of imaging the vocal folds in vivo at the extremely high speeds required.

Here, we test the hypothesis that specialization of different laryngeal structures supports the extreme frequency range of FM bats. We test this hypothesis in Daubenton’s bats (Myotis daubentonii) that have an extreme 7 octave f o range from 1 to 95 kHz [9,24].

Discussion

By filming the bat larynx in vitro with ultra-high-speed video up to 250,000 fps and using deep learning networks to extract vocal membrane motion, we provide the first direct observations that vocal membranes exhibit flow-induced self-sustained vibrations to produce echolocation calls in Daubenton’s bats.
Furthermore, we show that both vocal membranes and ventricular folds vibrate to produce sound, and do so in distinctly different frequency ranges. The vocal membranes generate 10 to 95 kHz high frequencies in the echolocation and social call range, while the ventricular folds produce 1 to 5 kHz low frequencies in the range of agonistic social calls.

Mammalian vocal membranes have been hypothesized to serve 3 specific purposes [18] that we can now test experimentally on bats. First, vocal membranes supposedly increase f o by uncoupling the vocal membrane vibration from vocal fold vibration. Our data confirm high-frequency vocal membrane vibration, and the in vitro f o range without CT modulation (10 to 20 kHz) corresponds well to the in vivo range of 8 to 20 kHz after bilateral ablation of the superior laryngeal nerve in E. fuscus [14]. In contrast to vocal membranes in marmosets [19], we observed that bat vocal membranes vibrated completely uncoupled from the vocal folds and did not observe any vocal fold motion at all.

Second, vocal membranes can supposedly reduce the PTP and thereby increase vocal efficiency. Our experimental data contradict these model-based suggestions. The vocal membranes had a PTP of 3.22 ± 1.41 kPa in vitro, which compares well to the in vivo PTP of 2.5 to 4.0 kPa in E. fuscus [14]. This species is twice the weight of M. daubentonii and thus its PTP may deviate from that of M. daubentonii. However, when comparing across mammals, such PTP values are, if anything, on the high side and certainly not lower. The unsteady aerodynamic conditions required to initiate vocal membrane vibration are fascinating. Low Reynolds number airfoils show peaks in drag and lift coefficients due to rapid acceleration of relative airspeed [32], which are preceded by the maximum acceleration points [33] in a manner that mirrors the pressure speed profiles preceding the vocal membrane vibration onset in this study.
Although the flow conditions are different and our observations are preliminary, they emphasize the need for further investigation of the role of unsteady aerodynamic effects in bat vocalizations.

Third, the vocal membranes supposedly support the production of broadband chaotic signals via increased oscillatory coupling [18]. Our data do not support this hypothesis in bats either. We did not observe mechanical coupling between vocal folds and vocal membranes, and although we did not quantify this specifically, we did not observe deterministic chaotic signals.

The role of the peculiar ventricular apical membranes remains unclear. The ventricular and vocal membranes form a drumhead with a narrow slit over the ventricle of Morgagni [15]. This configuration suggests the hypothesis that the ventricle of Morgagni acts as a cavity that generates a shallow cavity whistle [26] for echolocation calls. However, our data clearly show that removing the ventricular folds and membranes, and thereby the ventricle, still results in high-frequency sounds by vocal membrane oscillation. Therefore, they were not essential for sound production, but this does not exclude that they play a role. Perhaps the ventricular membranes are coupled to vocal membrane oscillation during echolocation calls. Unfortunately, we could not directly observe the ventricular membranes in our experiments, as they were either obscured by the ventricular folds or removed, together with the ventricular folds, when observing the vocal membranes and folds. Direct observations in vitro could involve a hemilarynx experiment, where the larynx is halved and closed by a glass plate through which the oscillations can be observed [34].

Anatomical adaptations in the bat larynx, such as the ossified cricoid and thyroid in combination with hypertrophied muscles, are purportedly adaptations to high pressures in the larynx during sound production [25]. However, an acoustic pressure of maximally 200 Pa (= 140 dB re.
20 μPa) [35] and maximal 8 kPa bronchial air pressures [14] do not exert much stress on bony structures with tensile strengths in the MPa range [36]. Instead, we propose that the ossification results from a strong selection on these structures to reduce weight while maintaining structural strength. The superfast CT muscles can power the rapid motion needed during feeding buzzes, but their speed trades off with force [2,37,38]. As a result, superfast muscles are exceptionally weak [37] and produce over 50 times lower tetanic stresses compared to normal skeletal muscles [39]. Muscular hypertrophy can partially compensate for the low area-specific force of superfast vocal muscles in bats [2] as it increases the cross-sectional area and thus the total force.

Taken together, we propose the evolutionary scenario that many laryngeal morphological adaptations in echolocating bats are the result of selection for producing (1) high-frequency and (2) rapid echolocation calls to catch fast-moving prey. This scenario would be concurrently followed by a complementary specialization of the auditory system that affords bats sensitive hearing at high frequencies and over a wide frequency range [40]. First, a strong selection to increase spatial resolution [3] led to an increase in f o by reducing the mass of the vibrating vocal membranes. Second, a strong selection to increase call repetition rate led to very low muscle force [2]. The reduced force was compensated by higher cross-sectional area (CSA), i.e., a hypertrophied muscle, and the actuated mass was reduced to require less force: The vocal folds reduced in mass and both thyroid and cricoid reduced in size and became ossified to withstand large bending moments during acceleration. Lastly, the reduced thyroid was replaced by the cricothyroid membrane to have a flexible, airtight trachea.
Taken together, these adaptations allowed the production of ultrasonic calls with fast FM that could be repeated above 200 Hz for catching erratic airborne prey in the dark.

The vocal membranes achieve unparalleled high voiced f o in bats. However, the vocal range of vocal membrane-produced echolocation calls, 10 to 95 kHz in Daubenton’s bat, is only 3 to 4 octaves and thereby comparable to other mammals [10]. When considering only vocal membrane-produced sounds, we expect the vocal range for all bats to fit within 3 to 4 octaves. As a consequence, we do not expect the material properties of the vocal membranes to be significantly different. However, because smaller strains in muscles allow faster motion, an increased stiffness would require a smaller range of motion to achieve the same vocal range [10]. Therefore, a stiffer vocal membrane would allow faster FM and call repetition rates at the same frequency bandwidth, but this remains to be tested.

There are only a limited number of known ways for mammals to lower f o. First, vocal folds can exhibit different vibratory patterns, also known as registers, due to differential posturing by laryngeal muscles [22,23]. In humans, the lowest register is the vocal fry register. The excised horseshoe bat larynx produced distinctly different frequencies that were suggested to be different registers [11], but no laryngeal dynamics were measured to confirm this. In contrast, our data suggest that in FM bats, echolocation calls and agonistic social calls are not caused by different vocal membrane registers, but by using different laryngeal structures. The mechanism by which ventricular folds decrease f o in other mammals is by coupled oscillation to vocal folds, as in tigers [41], grunting pigs [21], human throat singing [42], and metal growling [43]. In our preparation, we did not see vocal fold vibration in any condition and were not able to observe vocal folds during ventricular fold oscillation.
As such, we cannot be conclusive that the lower f o is the result of mechanical coupling between laryngeal structures. However, because we could not get the vocal folds to oscillate, we venture to speculate that in bats, the ventricular folds have taken on the role of lower frequency vibrations.

An additional effect of high f o is highly directional sound emission, i.e., sound pressure attenuates rapidly at angles away from the main broadcast axis. This has substantial benefits for navigation through echolocation [44], but likely becomes disadvantageous for social communication as the sender generally wishes to broadcast as broadly as possible depending on the context [45]. Thus, there likely is a strong opposing evolutionary drive for echolocation calls versus social calls. Echolocation favors high frequencies for spatial resolution and high directionality, while communication favors low frequencies for low directionality and low atmospheric attenuation. This duality may then have facilitated the evolution of separate vocal sub-structures with distinctly different sound producing purposes in bats. Likewise, fruit bats of the genus Rousettus echolocate by tongue clicks and communicate via laryngeal sounds [46], indicating a similar duality between echolocation and social call production. Together, the different mechanisms vastly expand the vocal range in bats and provide a rich substrate for vocal communication.

Materials and methods

Subjects

We used the larynges of 8 adult specimens of M. daubentonii in total (6 males, 2 females). Animals were caught under license 2020–9239 from the Ministry of Environment. Animals were housed in bat keeping facilities at 11L:13D photoperiod at approximately 22°C and 60% relative humidity. All experiments were conducted at the University of Southern Denmark and were in accordance with the Danish Animal Experiments Inspectorate (Copenhagen, Denmark).
Larynx dissection and preparation

All animals were euthanized with isoflurane (Baxter laboratories). The trachea, larynx, and surrounding tissue were dissected in ice-cold oxygenated buffer (150 mM NaCl, 2.5 mM KCl, 4 mM CaCl2, 1 mM NaH2PO4, 1 mM MgSO4, 10 mM HEPES, 12 mM Glucose, pH 7.4 adjusted with a 1 M Trizma solution). Five specimens (MD10, MD11, MD21, MD22, and MD23) were flash-frozen in liquid nitrogen and stored at −80°C. Two specimens (MD13 and MD14) were used fresh in the setup described below. For 1 specimen (MD12), the larynx was transferred to a Sylgard-covered petri dish on ice for inspection under a stereomicroscope (M165-FC, Leica Microsystems). This specimen was then also flash-frozen in liquid nitrogen and stored at −80°C. Later, this specimen was thawed and fixed in 4% PFA on a roller for cross-sections.

Before an experiment, we thawed the tissue in a refrigerator and then submerged it in refrigerated Ringer’s solution in a dish on ice and removed additional tissue surrounding the larynx and trachea. We then mounted the larynx on a rounded, blunted 21G needle (Sterican, 0.8 × 40 mm). The larynx was slid over the blunt needle until the caudal edge of the cricoid touched the tube exit and secured with a 10-0 monofilament suture (AroSurgical Instruments, California, United States of America) around the trachea.

Experimental setup

We mounted the larynges in the excised larynx setup described previously [26,27]. The setup allows for running humidified air through the larynx at precisely controlled pressures (model PCD, Alicat Scientific) while controlling the configuration of the larynx with micromanipulators and recording any sound produced. For recording the sound, we used a 1/4-inch pressure microphone-preamplifier assembly (model 46BD, frequency response ± 1 dB 10 Hz to 25 kHz and ± 2 dB 4 Hz to 70 kHz, G.R.A.S., Denmark).
The positions of the larynx and microphone were fixed relative to each other during an experiment, with the microphone placed horizontally at 22 to 44 mm from the larynx. The microphone signal was amplified (12AQ, G.R.A.S., Denmark) and calibrated before each experiment (Calibrator 42AB, G.R.A.S., Denmark). The sound, pressure, and flow signals were low pass filtered at 100, 10, and 10 kHz, respectively (filter model EF502 low pass filter DC–100 kHz and EF120 low pass filter DC–10 kHz, Thorlabs, USA), and digitized at 250 kHz (USB 6259, 16 bit, National Instruments, Austin, Texas, USA).

To capture the laryngeal configuration during the experiments, we used a Leica DC425 camera mounted on the stereomicroscope, controlled using LAS (Leica Application Suite Version 4.7.0, Leica Microsystems, Switzerland). To record tissue vibration, we used a high-speed camera (FASTCAM SA1.1, Photron, Tokyo, Japan) filming at 10,000 to 20,000 fps for ventricular folds and 100,000 to 250,000 fps for vocal membranes, controlled by Photron FASTCAM Viewer 4. For illumination, we used a Leica GLS150 lamp through a liquid light guide connected to the stereomicroscope (static images) or a Thorlabs plasma light source (HPLS200 Series) (high-speed imaging). All control and analysis software was written in MATLAB (MathWorks).

Excised larynx phonation protocol

We removed the epiglottis to give an unobstructed view of the ventricular folds and make adduction of the arytenoids easier. To induce ventricular fold vibration, we applied a linear increase in bronchial pressure from 0 to 6 kPa at a speed of 1 kPa/s. We wanted to minimize the amount of air flowing over the delicate laryngeal structures to prevent them from drying out. Because the PTP values were rather high, we did not always start at 0 kPa, but sometimes at 3 kPa. Ventricular fold vibration was induced in 4 larynges (MD10, MD11, MD13, and MD23).
We then turned on the plasma light source and repeated this ramp while triggering the camera when the pressure was passing the PTP. In 3 of these (MD11, MD13, and MD23), we successfully filmed their vibration.

To expose the vocal membranes, we carefully cut in a horizontal plane between the ventricular and vocal membranes with adventitia scissors (S&T surgical instruments, Switzerland) through the ventricle of Morgagni. To induce their vibration, we applied a slow pressure ramp from 0 to 7 kPa at 1 kPa/s. This type of pressure function only yielded oscillation for 1 out of the first 4 individuals, and we did not apply it for the last 2 to minimize experimental time. Next, we applied a sequence of four 300 ms fast pressure modulations between 0 and 4 kPa. This readily resulted in oscillation in 5 specimens (MD10, MD11, MD13, MD14, and MD23). Because we needed to film at rates up to 250,000 fps, we only had a short buffer available and sometimes needed several runs to trigger the camera during vocal membrane vibration with correct lighting conditions. We successfully filmed vocal membrane oscillation in 4 animals.

To increase the f o of the vocal membrane vibrations, we mimicked cricothyroid muscle contraction. We applied 5 to 7 kPa pressure for 1.5 seconds and manually rotated the thyroid downward to increase the tension of the vocal fold and membrane in 5 individuals (MD11, MD13, MD14, MD22, and MD23). Since the yin algorithm tends to fail for f o’s above one quarter of the sampling rate [47], we instead extracted them using the time frequency ridge detection function in MATLAB (tfridge) on spectrograms of the sound signal (nfft = 2,048, overlap = 50%, Hamming window) [26].

Glottovibrogram construction

Each video was rotated to make the glottal midline vertical and cropped around the glottis.
We then calculated the opening of the vocal folds as a function of anterior-posterior position (AP) and time, i.e., the glottovibrogram (GVG), by automated detection of the glottis shape per image. For the ventricular folds, the glottis was defined as all pixels below a manually set threshold gray value. The resulting logical image was horizontally and vertically dilated with a 2-pixel line (imdilate function in MATLAB) and filled (imfill), which resulted in an outline of the glottis. The glottis width was the sum of the vertical opening pixels scaled for magnification. To determine the position of the vocal membrane edges, we could not use a simple image grayscale threshold, because the vocal membranes were too translucent, and the trailing edge was crossing the underlying vocal folds with nearly the same pixel values. This led to erroneous detection of the thin vocal membrane parts as glottis. Instead, we trained a deep learning model to detect the vocal membrane edges using the deep learning Python package DeepLabCut (2.2b) [48,49]. We digitally superimposed 8 to 10 equidistantly spaced dashed horizontal lines on the videos and trained the network on detecting where the vocal membrane edges crossed these lines. The superimposed lines were used to fix the detections vertically as we were only interested in the horizontal movement of the vocal membranes. After training for 1 million iterations, the videos were analyzed, resulting in pixel coordinates for points along the glottal edge for each analyzed frame. To calculate the f o , we first determined the anterior-posterior (AP) location where the mean opening was maximal. Then, we extracted the opening at this location along the AP axis from the GVG. We resampled all other physiological signals (pressure, sound) to the framerate of the video (resample function in MATLAB). 
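As a rough illustration of the two selection steps just described, the sketch below picks the anterior-posterior row with the largest mean opening from a glottovibrogram and returns its opening over time. It is a minimal Python sketch, not the authors' MATLAB code: the function name `opening_at_max_ap`, the nested-list layout (rows are AP positions, columns are video frames), and the toy values are all assumptions made for illustration.

```python
from statistics import mean


def opening_at_max_ap(gvg):
    """Return (ap_index, opening_trace) for the AP row with the largest mean opening.

    Hypothetical layout: gvg[ap][frame] is the glottal opening at
    anterior-posterior position `ap` in video frame `frame`.
    """
    if not gvg or not gvg[0]:
        raise ValueError("GVG must contain at least one row and one frame")
    # Mean opening over time for every AP position.
    mean_per_ap = [mean(row) for row in gvg]
    # AP position where the mean opening is maximal (first one on ties).
    ap_index = max(range(len(gvg)), key=lambda i: mean_per_ap[i])
    return ap_index, list(gvg[ap_index])


# Toy GVG: 3 AP positions x 4 frames, arbitrary units (made-up values).
toy_gvg = [
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 3.0, 0.0, 3.0],
    [0.0, 2.0, 0.0, 2.0],
]
ap, trace = opening_at_max_ap(toy_gvg)
```

The returned trace is the one-dimensional opening signal that a pitch estimator would then work on.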
The f o of the sound and glottal opening signal was determined using the yin algorithm [47], combining signal power and aperiodicity criteria to extract f o per 10 frames.

Signal analysis

To determine PTP and S ptp , we first low pass filtered the pressure signal at 500 Hz with a sixth order Butterworth filter (butter and filtfilt functions in MATLAB) to eradicate any high-frequency fluctuations. The rate or speed of the pressure change was then calculated by first finding the pressure change between time steps (diff function in MATLAB) and then multiplying this value by the acquisition rate (250 kHz) to get the pressure speed (per second rate of pressure change). We defined PTP and S ptp as the pressure and pressure speed at the time where the sound power crossed 0.2 mPa.

In vivo social call recordings

Because we could not find detailed quantification of the low-frequency calls of M. daubentonii in the literature, we recorded 9 additional males in Odense, Denmark caught under license 2021–1194. Daubenton’s bats do not spontaneously produce low-frequency calls as easily as, e.g., Pipistrellus pygmaeus, and only 3 individuals produced such calls when (1) they were joined with others into 1 enclosure after daily weighing; or (2) when stroked while roosting in the large flight cage at SDU. We recorded calls with an Olympus LS-100 24-bit recorder at a sampling rate of 96 kHz and a Grass 40BF ¼” microphone connected to an Avisoft 16-bit USG at 375 kHz. We selected small segments that included calls and extracted the f o of the sound with the yin algorithm [47].

Statistics

All values listed are mean ± SD. The correlation between the f o of sound and vocal fold vibrations was established with linear regression (regress function) in MATLAB (MathWorks).
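The pressure-speed and threshold steps described under Signal analysis can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' MATLAB code: the function names, the toy signals, and the premise that the pressure trace has already been low pass filtered are illustrative. Only the 250 kHz acquisition rate and the difference-times-rate calculation come from the text.

```python
ACQUISITION_RATE_HZ = 250_000  # digitization rate stated in the methods


def pressure_speed(pressure, rate_hz=ACQUISITION_RATE_HZ):
    """Per-second rate of pressure change: sample-to-sample difference times rate."""
    return [(b - a) * rate_hz for a, b in zip(pressure, pressure[1:])]


def threshold_crossing(sound_level, threshold):
    """Index of the first sample at or above `threshold`, or None if never reached."""
    for i, value in enumerate(sound_level):
        if value >= threshold:
            return i
    return None


def ptp_and_speed(pressure, sound_level, threshold, rate_hz=ACQUISITION_RATE_HZ):
    """Pressure and pressure speed at the first threshold crossing of the sound level.

    Assumes `pressure` is already low pass filtered and aligned sample by
    sample with `sound_level`. Returns None if the threshold is never reached.
    """
    if len(pressure) < 2 or len(pressure) != len(sound_level):
        raise ValueError("signals must have equal length of at least 2 samples")
    i = threshold_crossing(sound_level, threshold)
    if i is None:
        return None
    speed = pressure_speed(pressure, rate_hz)
    # The difference signal is one sample shorter, so clamp the index.
    j = min(i, len(speed) - 1)
    return pressure[i], speed[j]


# Toy data (made up): a 1 kPa/s ramp sampled at 250 kHz and a rising sound level.
toy_pressure = [k * 4e-6 for k in range(5)]  # kPa
toy_sound = [0.0, 0.0, 0.1, 0.3, 0.5]        # arbitrary units
ptp, speed = ptp_and_speed(toy_pressure, toy_sound, threshold=0.2)
```

The threshold is left as a parameter so that the same sketch can be reused with whatever sound-level measure and units a recording provides.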
The boxplots were constructed using the MATLAB toolbox IoSR (v.2.8, Institute of Sound Recording, University of Surrey, 2016), with no limit for outliers, meaning horizontal lines indicate minimum, maximum, median, and interquartile range.

Acknowledgments

The authors would like to thank Danuta Wisniewska and Carl Lyons for assistance in catching bats.

[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001881